Vertebrate Smoothened Proteins

FIELD OF THE INVENTION

The present invention relates generally to novel Smoothened proteins which interact with
Hedgehog and Patched signalling molecules involved in cell proliferation and differentiation. In particular,
the invention relates to newly identified and isolated vertebrate Smoothened proteins and DNA encoding the
same, including rat and human Smoothened, and to various modified forms of these proteins, to vertebrate
Smoothened antibodies, and to various uses thereof.

BACKGROUND OF THE INVENTION
Development of multicellular organisms depends, at least in part, on mechanisms which
specify, direct or maintain positional information to pattern cells, tissues, or organs. Various secreted signalling
molecules, such as members of the transforming growth factor-beta ("TGF-beta"), Wnt, fibroblast growth factor
("FGF"), and hedgehog families, have been associated with patterning activity of different cells and structures
in Drosophila as well as in vertebrates [Perrimon, Cell, 80:517-520 (1995)].

Studies of Drosophila embryos have revealed that, at cellular blastoderm and later stages of
development, information is maintained across cell borders by signal transduction pathways. Such pathways
are believed to be initiated by extracellular signals like Wingless ("Wg") and Hedgehog ("Hh"). The
extracellular signal, Hh, has been shown to control expression of TGF-beta, Wnt and FGF signalling molecules,
and initiate both short-range and long-range signalling actions. A short-range action of Hh in Drosophila, for
example, is found in the ventral epidermis, where Hh is associated with causing adjacent cells to maintain
wingless (wg) expression [Perrimon, Cell, 76:781-784 (1994)]. In the vertebrate central nervous system, for
example, Sonic hedgehog ("SHh"; a secreted vertebrate homologue of dHh) is expressed in notochord cells and
is associated with inducing floor plate formation within the adjacent neural tube in a contact-dependent manner
[Roelink et al., Cell, 76:761-775 (1994)]. Perrimon, Cell, 80:517-520 (1995) provides a general review of some
of the long-range actions associated with Hh.

Studies of the Hh protein in Drosophila ("dHh") have shown that hh encodes a 46 kDa native
protein that is cleaved into a 39 kDa form following signal sequence cleavage and subsequently cleaved into
a 19 kDa amino-terminal form and a 26 kDa carboxy-terminal form [Lee et al., Science, 266:1528-1537
(1994)]. Lee et al. report that the 19 kDa and 26 kDa forms have different biochemical properties and are
differentially distributed. DiNardo et al. and others have disclosed that the dHh protein triggers a signal
transduction cascade that activates wg [DiNardo et al., Nature, 332:604-609 (1988); Hidalgo and Ingham,
Development, 110:291-301 (1990); Ingham and Hidalgo, Development, 117:283-291 (1993)] and at least
another segment polarity gene, patched (ptc) [Hidalgo and Ingham, supra; Tabata and Kornberg, Cell, 76:89-102
(1994)]. Properties and characteristics of dHh are also described in reviews by Ingham et al., Curr. Opin.
Genet. Dev., 5:492-498 (1995) and Lumsden and Graham et al., Curr. Biol., 5:1347-1350 (1995). Properties
and characteristics of the vertebrate homologue of dHh, Sonic hedgehog, are described by Echelard et al., Cell,
75:1417-1430 (1993); Krauss et al., Cell, 75:1431-1444 (1993); Riddle et al., Cell, 75:1401-1416 (1993);
Johnson et al., Cell, 79:1165-1173 (1994); Fan et al., Cell, 81:457-465 (1995); Roberts et al., Development,
121:3163-3174 (1995); and Hynes et al., Cell, 80:95-101 (1995).






WO 98/14475



PCT/US97/17433 



In Perrimon, Cell, 80:517-520 (1995), it was reported that the biochemical mechanisms and
receptors by which signalling molecules like Wg and Hh regulate the activities, transcription, or both, of
secondary signal transducers have generally not been well understood. In Drosophila, genetic evidence
indicates that Frizzled ("Fz") functions to transmit and transduce polarity signals in epidermal cells during hair
and bristle development. Fz rat homologues which have structural similarity with members of the G-protein-coupled
receptor superfamily have been described by Chan et al., J. Biol. Chem., 267:25202-25207 (1992).
Specifically, Chan et al. describe isolating two different cDNAs from a rat cell library, the first cDNA encoding
a predicted 641 residue protein, Fz-1, having 46% homology with Drosophila Fz, and a second cDNA encoding
a protein, Fz-2, of 570 amino acids that is 80% homologous with Fz-1. Chan et al. state that mammalian fz may
constitute a gene family important for transduction and intercellular transmission of polarity information during
tissue morphogenesis or in differentiated tissues. Recently, Bhanot et al. described the identification of a
Drosophila gene, frizzled2 (Dfz2), and predicted Dfz2 protein, which can function as a Wg receptor in cultured
cells [Bhanot et al., Nature, 382:225-230 (1996)]. Bhanot et al. disclose, however, that there is no in vivo
evidence that shows Dfz2 is required for Wg signalling.

Although some evidence suggests that cellular responses to dHh are dependent on the
transmembrane protein, smoothened (dSmo) [Nusslein-Volhard et al., Wilhelm Roux's Arch. Dev. Biol.,
193:267-282 (1984); Jurgens et al., Wilhelm Roux's Arch. Dev. Biol., 193:283-295 (1984); Alcedo et al., Cell,
86:221-232 (July 26, 1996); van den Heuvel and Ingham, Nature, 382:547-551 (August 8, 1996)], and are
negatively regulated by the transmembrane protein, "Patched" [Hooper and Scott, Cell, 59:751-765 (1989);
Nakano et al., Nature, 341:508-513 (1989); Hidalgo and Ingham, supra; Ingham et al., Nature, 353:184-187
(1991)], the receptors for Hh proteins have not previously been biochemically characterized. Various gene
products, including the Patched protein, the transcription factor cubitus interruptus, the serine/threonine kinase
"fused", and the gene products of Costal-2, smoothened (smo) and Suppressor of fused (Su(fu)), have been
implicated as putative components of the Hh signalling pathway.

Prior studies in Drosophila led to the hypothesis that ptc encoded the Hh receptor [Ingham
et al., Nature, 353:184-187 (1991)]. The activity of the ptc product, which is a multiple membrane spanning
cell surface protein referred to as Patched [Hooper and Scott, supra], represses the wg and ptc genes and is
antagonized by the Hh signal. Patched was proposed by Ingham et al. to be a constitutively active receptor
which is inactivated by binding of Hh, thereby permitting transcription of Hh-responsive genes. As reported
by Bejsovec and Wieschaus, Development, 119:501-517 (1993), however, Hh has effects in ptc null Drosophila
embryos, and thus Patched cannot be the only Hh receptor. Accordingly, the role of Patched in Hh signalling
has not been fully understood.

Goodrich et al. have isolated a murine patched gene [Goodrich et al., Genes Dev., 10:301-312
(1996)]. Human patched homologues have also been described in recently published literature. For
instance, Hahn et al., Cell, 85:841-851 (1996) describe isolation of a human homolog of Drosophila ptc. The
gene displays up to 67% sequence identity at the nucleotide level and 60% similarity at the amino acid level
with the Drosophila gene [Hahn et al., supra]. Johnson et al. also provide a predicted amino acid sequence of
a human Patched protein [Johnson et al., Science, 272:1668-1671 (1996)]. Johnson et al. disclose that the 1447
amino acid protein has 96% and 40% identity to mouse and Drosophila Patched, respectively. The human and
mouse data from these investigators suggest that patched is a single copy gene in mammals. According to
Hahn et al., Cell, 85:841-851 (1996), analyses revealed the presence of three different 5' ends for their human
ptc gene. Hahn et al. postulate there may be at least three different forms of the Patched protein in mammalian
cells: the ancestral form represented by the murine sequence, and the two human forms. Patched is further
discussed in a recent review by Marigo et al., Development, 122:1225 (1996).

Studies in Drosophila have also led to the hypothesis that Smo could be a candidate receptor
for Hh [Alcedo et al., supra; van den Heuvel and Ingham, supra]. The smoothened (smo) gene was identified
as a segment polarity gene and initially named smooth [Nusslein-Volhard et al., supra]. Since that name
already described another locus, though, the segment polarity gene was renamed smoothened [Lindsley and
Zimm, "The Genome of Drosophila melanogaster," San Diego, CA: Academic Press (1992)]. As first reported
by Nusslein-Volhard et al., supra, the smo gene is required for the maintenance of segmentation in Drosophila
embryos.

Alcedo et al., supra, have recently described the cloning of the Drosophila smoothened gene
[see also, van den Heuvel and Ingham, supra]. Alcedo et al. report that hydropathy analysis predicts that the
putative Smo protein is an integral membrane protein with seven membrane spanning alpha helices, a
hydrophobic segment near the N-terminus, and a hydrophilic C-terminal tail. Thus, Smo may belong to the
serpentine receptor family, whose members are all coupled to G proteins. Alcedo et al., supra, also report that
smo is necessary for Hh signalling and that it acts downstream of hh and ptc.

As discussed in Pennisi, Science, 272:1583-1584 (1996), certain development genes are
believed to play some role in cancer because they control cell growth and specialization. Recent studies
suggest that patched is a tumor suppressor, or a gene whose loss or inactivation contributes to the excessive
growth of cancer cells. Specifically, Hahn et al. and other investigators have found that patched is mutated in
some common forms of basal cell carcinomas in humans [Hahn et al., Cell, 85:841-851 (1996); Johnson et al.,
supra; Gailani et al., in Letters, Nature Genetics, 13: September, 1996]. Hahn et al. report that alterations
predicted to inactivate the patched gene product were found in six unrelated patients having basal cell nevus
syndrome ("BCNS"), a familial complex of cancers and developmental abnormalities. Hahn et al. also report
that the ptc pathway has been implicated in tumorigenesis by the cloning of the pancreatic tumor suppressor
gene, DPC4. Vertebrate homologues of two other Drosophila segment polarity genes, the murine mammary
Wnt1 [Rijsewijk et al., Cell, 50:649 (1987)] and the human glioblastoma GLI [Kinzler et al., Science, 236:70
(1987)], have also been implicated in cancer.

SUMMARY OF THE INVENTION
Applicants have identified cDNA clones that encode novel vertebrate Smoothened proteins,
designated herein as "vSmo." In particular, cDNA clones encoding rat Smoothened and human Smoothened
have been identified. The vSmo proteins of the invention have surprisingly been found to be co-expressed with
Patched proteins and to form physical complexes with Patched. Applicants also discovered that the vSmo alone
did not bind Sonic hedgehog but that vertebrate Patched homologues did bind Sonic hedgehog with relatively
high affinity. It is believed that Sonic hedgehog may mediate its biological activities through a multi-subunit
receptor in which vSmo is a signalling component and Patched is a ligand binding component, as well as a
ligand regulated suppressor of vSmo. Accordingly, without being limited to any one theory, pathological
conditions, such as basal cell carcinoma, associated with inactivated (or mutated) Patched may be the result
of constitutive activity of vSmo or vSmo signalling following from negative regulation by Patched.

In one embodiment, the invention provides isolated vertebrate Smoothened. In particular,
the invention provides isolated native sequence vertebrate Smoothened, which in one embodiment, includes
an amino acid sequence comprising residues 1 to 793 of Figure 1 (SEQ ID NO:2). The invention also provides
isolated native sequence vertebrate Smoothened which includes an amino acid sequence comprising residues
1 to 787 of Figure 4 (SEQ ID NO:4). In other embodiments, the isolated vertebrate Smoothened comprises
at least about 80% identity with native sequence vertebrate Smoothened comprising residues 1 to 787 of Figure
4 (SEQ ID NO:4).

In another embodiment, the invention provides chimeric molecules comprising vertebrate
Smoothened fused to a heterologous polypeptide or amino acid sequence. An example of such a chimeric
molecule comprises a vertebrate Smoothened fused to an epitope tag sequence.

In another embodiment, the invention provides an isolated nucleic acid molecule encoding
vertebrate Smoothened. In one aspect, the nucleic acid molecule is RNA or DNA that encodes a vertebrate
Smoothened, or is complementary to such encoding nucleic acid sequence, and remains stably bound to it under
stringent conditions. In one embodiment, the nucleic acid sequence is selected from:

(a) the coding region of the nucleic acid sequence of Figure 1 (SEQ ID NO:1) that codes for
residue 1 to residue 793 (i.e., nucleotides 450-452 through 2826-2828), inclusive;

(b) the coding region of the nucleic acid sequence of Figure 4 (SEQ ID NO:3) that codes for
residue 1 to residue 787 (i.e., nucleotides 13-15 through 2371-2373), inclusive; or

(c) a sequence corresponding to the sequence of (a) or (b) within the scope of degeneracy
of the genetic code.
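
By way of illustration only, the nucleotide coordinates recited in (a) and (b) can be checked with simple codon arithmetic: if residue 1 is encoded by a codon beginning at nucleotide s, residue n is encoded by the codon beginning at s + 3(n - 1). The following Python sketch is not part of the original disclosure; the function name codon_span is a hypothetical helper, and only the start positions (450 and 13) and residue counts (793 and 787) are taken from the text above.

```python
def codon_span(first_codon_start: int, residue: int) -> tuple[int, int]:
    """Return the 1-based (first, last) nucleotide positions of the codon
    encoding `residue`, given where the codon for residue 1 begins."""
    if residue < 1:
        raise ValueError("residue numbering starts at 1")
    start = first_codon_start + 3 * (residue - 1)
    return start, start + 2


# Rat Smoothened (SEQ ID NO:1): residue 1 is encoded starting at nucleotide 450.
print(codon_span(450, 1))    # (450, 452)
print(codon_span(450, 793))  # (2826, 2828)

# Human Smoothened (SEQ ID NO:3): residue 1 is encoded starting at nucleotide 13.
print(codon_span(13, 787))   # (2371, 2373)
```

The computed spans agree with the ranges stated in (a) and (b), which supports reading each recited range as running from the first codon through the last codon of the coding region.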

In a further embodiment, the invention provides a vector comprising the nucleic acid
molecule encoding the vertebrate Smoothened. A host cell comprising the vector or the nucleic acid molecule
is also provided. A method of producing vertebrate Smoothened is further provided.

In another embodiment, the invention provides an antibody which specifically binds to 
vertebrate Smoothened. The antibody may be an agonistic, antagonistic or neutralizing antibody. 

In another embodiment, the invention provides non-human, transgenic or knock-out animals.

Another embodiment of the invention provides articles of manufacture and kits that include
vertebrate Smoothened or vertebrate Smoothened antibodies.

A further embodiment of the invention provides protein complexes comprising vertebrate
Smoothened protein and vertebrate Patched protein. In one embodiment the complexes further include
vertebrate Hedgehog protein. The invention also provides vertebrate Patched which binds to vertebrate
Smoothened. Optionally, the vertebrate Patched comprises a sequence which is a derivative of or fragment of
a native sequence vertebrate Patched.

BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 shows the nucleotide (SEQ ID NO:1) and deduced amino acid sequence (SEQ ID
NO:2) of native sequence rat Smoothened.

Figure 2 shows the primary structure of rat Smo (rSmo) and Drosophila Smo (dsmo). The
signal peptide sequences are underlined, conserved amino acids are boxed, cysteines are marked with asterisks,
potential glycosylation sites are marked with dashed boxes, and the seven hydrophobic transmembrane domains
are shaded.

Figure 3 shows tissue distribution of SHH, Smo and Patched in embryonic and adult rat
tissues. In situ hybridization of SHH (left column), Smo (middle column) and Patched (right column, not
including insets) to rat tissues. Row E15 Sag, sagittal sections through E15 rat embryos. Rows E9, E10, E12,
and E15, coronal sections through E9 neural folds, E10 neural tube and somites, E12 and E15 neural tube.
Insets in Row E12 show sections through forelimb bud of E12 rat embryos. Legend: ht=heart; sk=skin;
bl=bladder; ts=testes; lu=lung; to=tongue; vtc=vertebral column; nf=neural fold; nc=notochord; so=somite;
fp=floor plate; vh=ventral horn; vz=ventricular zone; cm=cardiac mesoderm and vm=ventral midbrain.

Figure 4 shows the nucleotide (SEQ ID NO:3) and deduced amino acid sequence (SEQ ID
NO:4) for native sequence human Smoothened.

Figure 5 shows the primary structure of human Smo (hSmo) and rat Smo (rat.Smo) and
homology to Drosophila Smo (dros.smo). Conserved amino acids are boxed.

Figure 6 illustrates the results of binding and co-immunoprecipitation assays which show
SHH-N binds to mPatched but not to rSmo. Staining of cells expressing the Flag tagged rSmo (a and b) or Myc
tagged mPatched (c, d, and e) with (a) Flag (Smo) antibody; (c) Myc (mPatched) antibody; (b and d) IgG-SHH-N;
or (e) Flag tagged SHH-N. (f) Co-immunoprecipitation of epitope tagged mPatched (Patched) or epitope
tagged rSmo (Smo) with IgG-SHH-N. (g) Cross-linking of 125I-SHH-N (125I-SHH) to cells expressing
mPatched or rSmo in the absence or presence of unlabeled SHH-N. (h) Co-immunoprecipitation of 125I-SHH
by an epitope tagged mPatched (Patched) or an epitope tagged rSmo (Smo). (i) Competition binding of
125I-SHH to cells expressing mPatched or mPatched plus rSmo.

Figure 7 illustrates (a) Double immunohistochemical staining of Patched (red) and Smo
(green) in transfected cells. Yellow indicates co-expression of the two proteins. (b and c) Detection of Patched-Smo
complex by immunoprecipitation. (b) Immunoprecipitation with antibodies to the epitope tagged Patched
and analysis on a Western blot with antibodies to epitope tagged Smo. (c) Immunoprecipitation with antibodies
to the epitope tagged Smo and analysis on a Western blot with antibodies to epitope tagged Patched. (d and
e) Co-immunoprecipitation of 125I-SHH bound to cells expressing both Smo and Patched with antibodies to
either Smo (d) or Patched (e) epitope tags.

Figure 8 shows a Western blot from a SDS-gel depicting the expression level of a wildtype 
(WT) and mutated Patched (mutant). 

Figure 9 shows a model describing the putative SHH receptor and its proposed activation
by SHH. As shown in the model, Patched is a ligand binding component and vSmo is a signalling component
in a multi-subunit SHH receptor.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS 
I. Definitions

The terms "vertebrate Smoothened", "vertebrate Smoothened protein" and "vSmo" when
used herein encompass native sequence vertebrate Smoothened and vertebrate Smoothened variants (each of
which is defined herein). These terms encompass Smoothened from a variety of animals classified as
vertebrates, including mammals. In a preferred embodiment, the vertebrate Smoothened is rat Smoothened
(rSmo) or human Smoothened (hSmo). The vertebrate Smoothened may be isolated from a variety of sources,
such as from human tissue types or from another source, or prepared by recombinant or synthetic methods.

A "native sequence vertebrate Smoothened" comprises a protein having the same amino acid
sequence as a vertebrate Smoothened derived from nature. Thus, a native sequence vertebrate Smoothened
can have the amino acid sequence of naturally occurring human Smoothened, rat Smoothened, or Smoothened
from any other vertebrate. Such native sequence vertebrate Smoothened can be isolated from nature or can be
produced by recombinant or synthetic means. The term "native sequence vertebrate Smoothened" specifically
encompasses naturally-occurring truncated forms of the vertebrate Smoothened, naturally-occurring variant
forms (e.g., alternatively spliced forms) and naturally-occurring allelic variants of the vertebrate Smoothened.
In one embodiment of the invention, the native sequence vertebrate Smoothened is a mature native sequence
Smoothened comprising the amino acid sequence of SEQ ID NO:4. In another embodiment of the invention,
the native sequence vertebrate Smoothened is a mature native sequence Smoothened comprising the amino acid
sequence of SEQ ID NO:2.

"Vertebrate Smoothened variant" means a vertebrate Smoothened as defined below having 
less than 100% sequence identity with vertebrate Smoothened having the deduced amino acid sequence shown
in SEQ ID NO:4 for human Smoothened or SEQ ID NO:2 for rat Smoothened. Such vertebrate Smoothened
variants include, for instance, vertebrate Smoothened proteins wherein one or more amino acid residues are
added at the N- or C-terminus of, or within, the sequences of SEQ ID NO:4 or SEQ ID NO:2; wherein about
one to thirty amino acid residues are deleted, or optionally substituted by one or more amino acid residues; and
derivatives thereof, wherein an amino acid residue has been covalently modified so that the resulting product
has a non-naturally occurring amino acid. Ordinarily, a vertebrate Smoothened variant will have at least about
80% sequence identity, more preferably at least about 90% sequence identity, and even more preferably at least
about 95% sequence identity with the sequence of SEQ ID NO:4 or SEQ ID NO:2.
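
By way of illustration only, the percent identity thresholds recited above can be evaluated on a pair of sequences that have already been aligned. The following Python sketch is an assumption-laden example, not a method taken from this disclosure: the text does not specify an alignment algorithm or how gaps are counted, so the sketch takes pre-aligned strings (gaps written as "-") and, as one of several possible conventions, ignores columns containing a gap. The function names and the short test sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned, non-gap columns of two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # convention assumed here: gap columns are not counted
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * matches / compared


def meets_identity(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the two aligned sequences share at least `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical 8-residue fragments differing at one position: 7/8 identical.
native = "MKTAYIAK"
variant = "MKTAYLAK"
print(percent_identity(native, variant))      # 87.5
print(meets_identity(native, variant, 80.0))  # True
print(meets_identity(native, variant, 90.0))  # False
```

A different gap convention (for example, counting gap columns as mismatches) would yield lower values, so any comparison against the 80%, 90% or 95% figures should state which convention is used.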

The term "epitope tag" when used herein refers to a tag polypeptide having enough residues
to provide an epitope against which an antibody thereagainst can be made, yet is short enough such that it does
not interfere with activity of the vertebrate Smoothened. The tag polypeptide preferably also is fairly unique
so that the antibody thereagainst does not substantially cross-react with other epitopes. Suitable tag
polypeptides generally have at least six amino acid residues and usually between about 8-50 amino acid
residues (preferably between about 9-30 residues).
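
By way of illustration only, the length ranges recited for tag polypeptides can be checked against commonly used epitope tags. In the following Python sketch, the tag sequences (FLAG, c-Myc and HA) are the widely published sequences of those tags and are supplied here as assumptions; the disclosure itself refers to Flag and Myc tags without reciting their sequences. The function name is hypothetical, and the "about" in the recited ranges is treated as exact bounds for simplicity.

```python
# Widely used epitope tag sequences (assumed here; not recited in this document).
EPITOPE_TAGS = {
    "FLAG": "DYKDDDDK",
    "c-Myc": "EQKLISEEDL",
    "HA": "YPYDVPDYA",
}


def tag_length_profile(tag_sequence: str) -> dict:
    """Classify a tag by the length ranges recited in the epitope tag definition."""
    n = len(tag_sequence)
    return {
        "length": n,
        "at_least_six": n >= 6,
        "usual_range_8_to_50": 8 <= n <= 50,
        "preferred_range_9_to_30": 9 <= n <= 30,
    }


for name, sequence in EPITOPE_TAGS.items():
    print(name, tag_length_profile(sequence))
```

Under these assumptions, all three tags satisfy the general and usual ranges, while the eight-residue FLAG tag falls just below the preferred range of about 9-30 residues.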

"Isolated," when used to describe the various proteins disclosed herein, means protein that 
has been identified and separated and/or recovered from a component of its natural environment. Contaminant 
components of its natural environment are materials that would typically interfere with diagnostic or therapeutic
uses for the protein, and may include enzymes, hormones, and other proteinaceous or non-proteinaceous
substances. In preferred embodiments, the protein will be purified (1) to a degree sufficient to obtain at least
15 residues of N-terminal or internal amino acid sequence by use of a spinning cup sequenator, or (2) to
homogeneity by SDS-PAGE under non-reducing or reducing conditions using Coomassie blue or, preferably,
silver stain. Isolated protein includes protein in situ within recombinant cells, since at least one component of
the vSmo natural environment will not be present. Ordinarily, however, isolated protein will be prepared by
at least one purification step.

An "isolated" vSmo nucleic acid molecule is a nucleic acid molecule that is identified and
separated from at least one contaminant nucleic acid molecule with which it is ordinarily associated in the
natural source of the vSmo nucleic acid. An isolated vSmo nucleic acid molecule is other than in the form or
setting in which it is found in nature. Isolated vSmo nucleic acid molecules therefore are distinguished from
the vSmo nucleic acid molecule as it exists in natural cells. However, an isolated vSmo nucleic acid molecule
includes vSmo nucleic acid molecules contained in cells that ordinarily express vSmo where, for example, the
nucleic acid molecule is in a chromosomal location different from that of natural cells.

The term "control sequences" refers to DNA sequences necessary for the expression of an
operably linked coding sequence in a particular host organism. The control sequences that are suitable for
prokaryotes, for example, include a promoter, optionally an operator sequence, and a ribosome binding site.
Eukaryotic cells are known to utilize promoters, polyadenylation signals, and enhancers.

Nucleic acid is "operably linked" when it is placed into a functional relationship with another
nucleic acid sequence. For example, DNA for a presequence or secretory leader is operably linked to DNA
for a polypeptide if it is expressed as a preprotein that participates in the secretion of the polypeptide; a
promoter or enhancer is operably linked to a coding sequence if it affects the transcription of the sequence; or
a ribosome binding site is operably linked to a coding sequence if it is positioned so as to facilitate translation.
Generally, "operably linked" means that the DNA sequences being linked are contiguous, and, in the case of
a secretory leader, contiguous and in reading phase. However, enhancers do not have to be contiguous.
Linking is accomplished by ligation at convenient restriction sites. If such sites do not exist, the synthetic
oligonucleotide adaptors or linkers are used in accordance with conventional practice.
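
By way of illustration only, the requirement that a secretory leader be "in reading phase" with the polypeptide coding sequence can be expressed as a check that the nucleotides preceding the polypeptide codons (leader plus any adaptor or linker) total a multiple of three. The following Python sketch is a simplified assumption, not a procedure from this disclosure; the function name and all sequences are hypothetical placeholders.

```python
def in_reading_phase(leader_nt: str, junction_nt: str = "") -> bool:
    """True if a coding sequence placed after leader + junction stays in the
    leader's reading frame, i.e. the upstream length is a multiple of three."""
    return (len(leader_nt) + len(junction_nt)) % 3 == 0


# Hypothetical 60-nucleotide leader: a start codon followed by 19 further codons.
leader = "ATG" + "GCT" * 19

print(in_reading_phase(leader))            # True: direct, contiguous fusion
print(in_reading_phase(leader, "GATC"))    # False: a 4-nt linker shifts the frame
print(in_reading_phase(leader, "GGATCC"))  # True: a 6-nt linker preserves the frame
```

The sketch only addresses frame; it does not check for stop codons introduced by a linker or for a functional cleavage site, both of which would also matter in practice.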

The term "antibody" is used in the broadest sense and specifically covers single anti-vSmo
monoclonal antibodies (including agonist, antagonist, and neutralizing antibodies) and anti-vSmo antibody
compositions with polyepitopic specificity.

The term "monoclonal antibody" as used herein refers to an antibody obtained from a
population of substantially homogeneous antibodies, i.e., the individual antibodies comprising the population
are identical except for possible naturally-occurring mutations that may be present in minor amounts.
Monoclonal antibodies are highly specific, being directed against a single antigenic site. Furthermore, in
contrast to conventional (polyclonal) antibody preparations which typically include different antibodies directed
against different determinants (epitopes), each monoclonal antibody is directed against a single determinant
on the antigen.

The monoclonal antibodies herein include hybrid and recombinant antibodies produced by
splicing a variable (including hypervariable) domain of an anti-vSmo antibody with a constant domain (e.g.,
"humanized" antibodies), or a light chain with a heavy chain, or a chain from one species with a chain from
another species, or fusions with heterologous proteins, regardless of species of origin or immunoglobulin class
or subclass designation, as well as antibody fragments (e.g., Fab, F(ab')2, and Fv), so long as they exhibit the
desired activity. See, e.g., U.S. Pat. No. 4,816,567 and Mage et al., in Monoclonal Antibody Production
Techniques and Applications, pp. 79-97 (Marcel Dekker, Inc.: New York, 1987).

Thus, the modifier "monoclonal" indicates the character of the antibody as being obtained
from a substantially homogeneous population of antibodies, and is not to be construed as requiring production
of the antibody by any particular method. For example, the monoclonal antibodies to be used in accordance
with the present invention may be made by the hybridoma method first described by Kohler and Milstein,
Nature, 256:495 (1975), or may be made by recombinant DNA methods such as described in U.S. Pat. No.
4,816,567. The "monoclonal antibodies" may also be isolated from phage libraries generated using the
techniques described in McCafferty et al., Nature, 348:552-554 (1990), for example.

"Humanized" forms of non-human (e.g., murine) antibodies are specific chimeric
immunoglobulins, immunoglobulin chains, or fragments thereof (such as Fv, Fab, Fab', F(ab')2 or other antigen-binding
subsequences of antibodies) which contain minimal sequence derived from non-human
immunoglobulin. For the most part, humanized antibodies are human immunoglobulins (recipient antibody)
in which residues from a complementary determining region (CDR) of the recipient are replaced by residues
from a CDR of a non-human species (donor antibody) such as mouse, rat, or rabbit having the desired
specificity, affinity, and capacity. In some instances, Fv framework region (FR) residues of the human
immunoglobulin are replaced by corresponding non-human residues. Furthermore, the humanized antibody
may comprise residues which are found neither in the recipient antibody nor in the imported CDR or framework
sequences. These modifications are made to further refine and optimize antibody performance. In general,
the humanized antibody will comprise substantially all of at least one, and typically two, variable domains, in
which all or substantially all of the CDR regions correspond to those of a non-human immunoglobulin and all
or substantially all of the FR regions are those of a human immunoglobulin consensus sequence. The
humanized antibody optimally also will comprise at least a portion of an immunoglobulin constant region or
domain (Fc), typically that of a human immunoglobulin.

The term "vertebrate" as used herein refers to any animal
classified as a vertebrate including certain classes of fish, reptiles, birds, and mammals. The term "mammal"
as used herein refers to any animal classified as a mammal, including humans, cows, rats, mice, horses, dogs
and cats.

II. Modes For Carrying Out The Invention 

The present invention is based on the discovery of vertebrate homologues of Smoothened.
In particular, Applicants have identified and isolated human and rat Smoothened. The properties and
characteristics of human and rat Smoothened are described in further detail in the Examples below. Based
upon the properties and characteristics of human and rat Smoothened disclosed herein, it is Applicants' present
belief that vertebrate Smoothened is a signalling component in a multi-subunit Hedgehog (particularly Sonic
Hedgehog "SHH") receptor.

A description follows as to how vertebrate Smoothened may be prepared. 

A. Preparation of vSmo

Techniques suitable for the production of vSmo are well known in the art and include
isolating vSmo from an endogenous source of the polypeptide, peptide synthesis (using a peptide synthesizer)
and recombinant techniques (or any combination of these techniques). The description below relates primarily
to production of vSmo by culturing cells transformed or transfected with a vector containing vSmo nucleic acid.
It is, of course, contemplated that alternative methods, which are well known in the art, may be employed to
prepare vSmo.

1. Isolation of DNA Encoding vSmo

The DNA encoding vSmo may be obtained from any cDNA library prepared from tissue
believed to possess the vSmo mRNA and to express it at a detectable level. Accordingly, human Smo DNA
can be conveniently obtained from a cDNA library prepared from human tissues, such as the library of human
embryonic lung cDNA described in Example 3. Rat Smo DNA can be conveniently obtained from a cDNA
library prepared from rat tissues, such as described in Example 1. The vSmo-encoding gene may also be
obtained from a genomic library or by oligonucleotide synthesis.

Libraries can be screened with probes (such as antibodies to the vSmo or oligonucleotides
or polypeptides as described in the Examples) designed to identify the gene of interest or the protein encoded
by it. The probes are preferably labeled such that they can be detected upon hybridization to DNA in the
library being screened. Methods of labeling are well known in the art, and include the use of radiolabels like
32P-labeled ATP, biotinylation or enzyme labeling. Screening the cDNA or genomic library with a selected
probe may be conducted using standard procedures, such as described in Sambrook et al., Molecular Cloning:
A Laboratory Manual (New York: Cold Spring Harbor Laboratory Press, 1989). An alternative means to
isolate the gene encoding vSmo is to use PCR methodology [Sambrook et al., supra; Dieffenbach et al., PCR
Primer: A Laboratory Manual (Cold Spring Harbor Laboratory Press, 1995)].

Nucleic acid having all the protein coding sequence may be obtained by screening selected
cDNA or genomic libraries using the deduced amino acid sequences disclosed herein, and, if necessary, using
conventional primer extension procedures as described in Sambrook et al., supra, to detect precursors and
processing intermediates of mRNA that may not have been reverse-transcribed into cDNA.
25 vSmo variants can be prepared by introducing appropriate nucleotide changes into the vSmo 

DNA. or by synthesis of the desired vSmo polypeptide. Those skilled in the art will appreciate that amino acid 
changes (compared to native sequence vSmo) may alter post-translaiional processes of the vSmo. such as 
changing the number or position of glycosylation sites. 

Variations in the native sequence vSmo can be made using any of the techniques and
guidelines for conservative and non-conservative mutations set forth in U.S. Pat. No. 5,364,934. These include
oligonucleotide-mediated (site-directed) mutagenesis, alanine scanning, and PCR mutagenesis.
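
As a rough illustration of how an alanine scan enumerates variants, the following Python sketch replaces each
non-alanine residue of a protein sequence with alanine, one position at a time. The function name and the toy
peptide are hypothetical and are not taken from the vSmo sequences; the sketch only lists candidate variants and
says nothing about how they would be constructed or assayed.

```python
def alanine_scan(protein: str) -> list[tuple[str, str]]:
    """Return (label, variant) pairs, one for each non-alanine position."""
    variants = []
    for i, residue in enumerate(protein):
        if residue == "A":
            continue  # already alanine; there is nothing to substitute
        label = f"{residue}{i + 1}A"  # e.g. "K2A": residue K at position 2 -> A
        variants.append((label, protein[:i] + "A" + protein[i + 1:]))
    return variants


peptide = "MKAL"  # hypothetical toy sequence, not a vSmo fragment
for label, sequence in alanine_scan(peptide):
    print(label, sequence)
```

Each label follows the common "original residue, position, new residue" convention, so a variant can be traced
back to the position that was changed.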
2. Insertion of Nucleic Acid into A Replicable Vector 

The nucleic acid (e.g., cDNA or genomic DNA) encoding vSmo may be inserted into a
replicable vector for further cloning (amplification of the DNA) or for expression. Various vectors are publicly
available. The vector components generally include, but are not limited to, one or more of the following: a
signal sequence, an origin of replication, one or more marker genes, an enhancer element, a promoter, and a
transcription termination sequence, each of which is described below.

(i) Signal Sequence Component

The vSmo may be produced recombinantly not only directly, but also as a fusion polypeptide
with a heterologous amino acid sequence or polypeptide, which may be a signal sequence or other polypeptide
having a specific cleavage site at the N-terminus of the mature protein or polypeptide. In general, the signal
sequence may be a component of the vector, or it may be a part of the vSmo DNA that is inserted into the
vector. The heterologous signal sequence selected preferably is one that is recognized and processed (i.e.,
cleaved by a signal peptidase) by the host cell.

(ii) Origin of Replication Component

Both expression and cloning vectors contain a nucleic acid sequence that enables the vector
to replicate in one or more selected host cells. Generally, in cloning vectors this sequence is one that enables
the vector to replicate independently of the host chromosomal DNA, and includes origins of replication or
autonomously replicating sequences. Such sequences are well known for a variety of bacteria, yeast, and
viruses.

Most expression vectors are "shuttle" vectors, i.e., they are capable of replication in at least
one class of organisms but can be transfected into another organism for expression. For example, a vector is
cloned in E. coli and then the same vector is transfected into yeast or mammalian cells for expression even
though it is not capable of replicating independently of the host cell chromosome.

DNA may also be amplified by insertion into the host genome. This is readily accomplished
using Bacillus species as hosts, for example, by including in the vector a DNA sequence that is complementary
to a sequence found in Bacillus genomic DNA. Transfection of Bacillus with this vector results in homologous
recombination with the genome and insertion of vSmo DNA.

(iii) Selection Gene Component

Expression and cloning vectors typically contain a selection gene, also termed a selectable
marker. This gene encodes a protein necessary for the survival or growth of transformed host cells grown in
a selective culture medium. Host cells not transformed with the vector containing the selection gene will not
survive in the culture medium. Typical selection genes encode proteins that (a) confer resistance to antibiotics
or other toxins, e.g., ampicillin, neomycin, methotrexate, or tetracycline, (b) complement auxotrophic
deficiencies, or (c) supply critical nutrients not available from complex media, e.g., the gene encoding
D-alanine racemase for Bacilli.

One example of a selection scheme utilizes a drug to arrest growth of a host cell. Those cells
that are successfully transformed with a heterologous gene produce a protein conferring drug resistance and
thus survive the selection regimen. Examples of such dominant selection use the drugs neomycin [Southern
et al., J. Molec. Appl. Genet., 1:327 (1982)], mycophenolic acid [Mulligan et al., Science, 209:1422 (1980)]
or hygromycin [Sugden et al., Mol. Cell. Biol., 5:410-413 (1985)]. The three examples given above employ
bacterial genes under eukaryotic control to convey resistance to the appropriate drug G418 or neomycin
(geneticin), xgpt (mycophenolic acid), or hygromycin, respectively.

Another example of suitable selectable markers for mammalian cells are those that enable
the identification of cells competent to take up the vSmo nucleic acid, such as DHFR or thymidine kinase. The
mammalian cell transformants are placed under selection pressure that only the transformants are uniquely
adapted to survive by virtue of having taken up the marker. Selection pressure is imposed by culturing the
transformants under conditions in which the concentration of selection agent in the medium is successively
changed, thereby leading to amplification of both the selection gene and the DNA that encodes vSmo.
Amplification is the process by which genes in greater demand for the production of a protein critical for
growth are reiterated in tandem within the chromosomes of successive generations of recombinant cells.

Cells transformed with the DHFR selection gene may first be identified by culturing all of
the transformants in a culture medium that contains methotrexate (Mtx), a competitive antagonist of DHFR.
An appropriate host cell when wild-type DHFR is employed is the Chinese hamster ovary (CHO) cell line
deficient in DHFR activity, prepared and propagated as described by Urlaub et al., Proc. Natl. Acad. Sci. USA,
77:4216 (1980). The transformed cells are then exposed to increased levels of methotrexate. This leads to the
synthesis of multiple copies of the DHFR gene, and, concomitantly, multiple copies of other DNA comprising
the expression vectors, such as the DNA encoding vSmo.

(iv) Promoter Component 
Expression and cloning vectors usually contain a promoter that is recognized by the host
organism and is operably linked to the vSmo nucleic acid sequence. Promoters are untranslated sequences
located upstream (5') to the start codon of a structural gene (generally within about 100 to 1000 bp) that control
the transcription and translation of a particular nucleic acid sequence, such as the vSmo nucleic acid sequence,
to which they are operably linked. Such promoters typically fall into two classes, inducible and constitutive.
Inducible promoters are promoters that initiate increased levels of transcription from DNA under their control
in response to some change in culture conditions, e.g., the presence or absence of a nutrient or a change in
temperature. At this time a large number of promoters recognized by a variety of potential host cells are well
known. These promoters are operably linked to vSmo encoding DNA by removing the promoter from the
source DNA by restriction enzyme digestion and inserting the isolated promoter sequence into the vector.

Promoters suitable for use with prokaryotic hosts include the β-lactamase and lactose
promoter systems [Chang et al., Nature, 275:615 (1978); Goeddel et al., Nature, 281:544 (1979)], alkaline
phosphatase, a tryptophan (trp) promoter system [Goeddel, Nucleic Acids Res., 8:4057 (1980); EP 36,776],
and hybrid promoters such as the tac promoter [deBoer et al., Proc. Natl. Acad. Sci. USA, 80:21-25 (1983)].

Promoter sequences are known for eukaryotes. Virtually all eukaryotic genes have an
AT-rich region located approximately 25 to 30 bases upstream from the site where transcription is initiated.
Another sequence found 70 to 80 bases upstream from the start of transcription of many genes is a CXCAAT
region where X may be any nucleotide. At the 3' end of most eukaryotic genes is an AATAAA sequence that
may be the signal for addition of the poly A tail to the 3' end of the coding sequence. All of these sequences
are suitably inserted into eukaryotic expression vectors.
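
The sequence elements named above can be located in a candidate sequence with a simple pattern search. The
Python sketch below is illustrative only: the regular expressions, function names, and toy sequence are
assumptions made for the example, and a match indicates only that the motif is present, not that it functions
as a promoter element or polyadenylation signal.

```python
import re

# Motifs named in the text; expressing them as regexes is an illustrative assumption.
CXCAAT = re.compile(r"C[ACGT]CAAT")   # "CXCAAT", where X is any nucleotide
POLYA_SIGNAL = re.compile(r"AATAAA")  # sequence found near the 3' end of most genes


def at_fraction(dna: str) -> float:
    """Fraction of A/T bases in a window (a crude check for an AT-rich region)."""
    dna = dna.upper()
    return sum(base in "AT" for base in dna) / len(dna) if dna else 0.0


def find_motifs(dna: str) -> dict[str, list[int]]:
    """Return 0-based start positions of each motif on the given strand."""
    dna = dna.upper()
    return {
        "CXCAAT": [m.start() for m in CXCAAT.finditer(dna)],
        "AATAAA": [m.start() for m in POLYA_SIGNAL.finditer(dna)],
    }


toy = "GGCCCAATGGTTAATAAAGG"  # hypothetical sequence, not from vSmo
print(find_motifs(toy))
print(at_fraction("TATAAA"))
```

A fuller check would also compare each hit's distance from a known transcription start site against the
approximate spacings given above (25 to 30 and 70 to 80 bases upstream).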

Examples of suitable promoting sequences for use with yeast hosts include the promoters for
3-phosphoglycerate kinase [Hitzeman et al., J. Biol. Chem., 255:2073 (1980)] or other glycolytic enzymes
[Hess et al., J. Adv. Enzyme Reg., 7:149 (1968); Holland, Biochemistry, 17:4900 (1978)], such as enolase,
glyceraldehyde-3-phosphate dehydrogenase, hexokinase, pyruvate decarboxylase, phosphofructokinase,
glucose-6-phosphate isomerase, 3-phosphoglycerate mutase, pyruvate kinase, triosephosphate isomerase,
phosphoglucose isomerase, and glucokinase. Suitable vectors and promoters for use in yeast expression are
further described in EP 73,657.

vSmo transcription from vectors in mammalian host cells is controlled, for example, by
promoters obtained from the genomes of viruses such as polyoma virus, fowlpox virus (UK 2,211,504
published 5 July 1989), adenovirus (such as Adenovirus 2), bovine papilloma virus, avian sarcoma virus,
cytomegalovirus, a retrovirus, hepatitis-B virus and most preferably Simian Virus 40 (SV40), from
heterologous mammalian promoters, e.g., the actin promoter or an immunoglobulin promoter.

The early and late promoters of the SV40 virus are conveniently obtained as an SV40
restriction fragment that also contains the SV40 viral origin of replication [Fiers et al., Nature, 273:113 (1978);
Mulligan and Berg, Science, 209:1422-1427 (1980); Pavlakis et al., Proc. Natl. Acad. Sci. USA, 78:7398-7402
(1981)]. The immediate early promoter of the human cytomegalovirus is conveniently obtained as a HindIII
E restriction fragment [Greenaway et al., Gene, 18:355-360 (1982)]. A system for expressing DNA in
mammalian hosts using the bovine papilloma virus as a vector is disclosed in U.S. Patent No. 4,419,446. A
modification of this system is described in U.S. Patent No. 4,601,978 [See also Gray et al., Nature, 295:503-508
(1982) on expressing cDNA encoding immune interferon in monkey cells; Reyes et al., Nature, 297:598-601
(1982) on expression of human β-interferon cDNA in mouse cells under the control of a thymidine kinase
promoter from herpes simplex virus; Canaani and Berg, Proc. Natl. Acad. Sci. USA, 79:5166-5170 (1982) on
expression of the human interferon β1 gene in cultured mouse and rabbit cells; and Gorman et al., Proc. Natl.
Acad. Sci. USA, 79:6777-6781 (1982) on expression of bacterial CAT sequences in CV-1 monkey kidney cells,
chicken embryo fibroblasts, Chinese hamster ovary cells, HeLa cells, and mouse NIH-3T3 cells using the Rous
sarcoma virus long terminal repeat as a promoter].

(v) Enhancer Element Component 

Transcription of a DNA encoding the vSmo by higher eukaryotes may be increased by
inserting an enhancer sequence into the vector. Enhancers are cis-acting elements of DNA, usually about from
10 to 300 bp, that act on a promoter to increase its transcription. Enhancers are relatively orientation and
position independent, having been found 5' [Laimins et al., Proc. Natl. Acad. Sci. USA, 78:993 (1981)] and
3' [Lusky et al., Mol. Cell Bio., 3:1108 (1983)] to the transcription unit, within an intron [Banerji et al., Cell,
33:729 (1983)], as well as within the coding sequence itself [Osborne et al., Mol. Cell Bio., 4:1293 (1984)].
Many enhancer sequences are now known from mammalian genes (globin, elastase, albumin, α-fetoprotein,
and insulin). Typically, however, one will use an enhancer from a eukaryotic cell virus. Examples include the
SV40 enhancer on the late side of the replication origin (bp 100-270), the cytomegalovirus early promoter
enhancer, the polyoma enhancer on the late side of the replication origin, and adenovirus enhancers. See also
Yaniv, Nature, 297:17-18 (1982) on enhancing elements for activation of eukaryotic promoters.

(vi) Transcription Termination Component 

Expression vectors used in eukaryotic host cells (yeast, fungi, insect, plant, animal, human,
or nucleated cells from other multicellular organisms) will also typically contain sequences necessary for the
termination of transcription and for stabilizing the mRNA. Such sequences are commonly available from the
5' and, occasionally, 3' untranslated regions of eukaryotic or viral DNAs or cDNAs. These regions contain
nucleotide segments transcribed as polyadenylated fragments in the untranslated portion of the mRNA encoding
vSmo.

(vii) Construction and Analysis of Vectors

Construction of suitable vectors containing one or more of the above-listed components
employs standard ligation techniques. Isolated plasmids or DNA fragments are cleaved, tailored, and re-ligated
in the form desired to generate the plasmids required.

For analysis to confirm correct sequences in plasmids constructed, the ligation mixtures can
be used to transform E. coli K12 strain 294 (ATCC 31,446) and successful transformants selected by ampicillin
or tetracycline resistance where appropriate. Plasmids from the transformants are prepared, analyzed by
restriction endonuclease digestion, and/or sequenced by the method of Messing et al., Nucleic Acids Res.,
9:309 (1981) or by the method of Maxam et al., Methods in Enzymology, 65:499 (1980).

(viii) Transient Expression Vectors 

Expression vectors that provide for the transient expression in mammalian cells of DNA
encoding vSmo may be employed. In general, transient expression involves the use of an expression vector
that is able to replicate efficiently in a host cell, such that the host cell accumulates many copies of the
expression vector and, in turn, synthesizes high levels of a desired polypeptide encoded by the expression
vector [Sambrook et al., supra]. Transient expression systems, comprising a suitable expression vector and
a host cell, allow for the convenient positive identification of polypeptides encoded by cloned DNAs, as well
as for the rapid screening of such polypeptides for desired properties.

(ix) Suitable Exemplary Vertebrate Cell Vectors

Other methods, vectors, and host cells suitable for adaptation to the synthesis of vSmo in
recombinant vertebrate cell culture are described in Gething et al., Nature, 293:620-625 (1981); Mantei et al.,
Nature, 281:40-46 (1979); EP 117,060; and EP 117,058.

3. Selection and Transformation of Host Cells 

Suitable host cells for cloning or expressing the DNA in the vectors herein are the prokaryote,
yeast, or higher eukaryote cells described above. Suitable prokaryotes for this purpose include but are not
limited to eubacteria, such as Gram-negative or Gram-positive organisms, for example, Enterobacteriaceae such
as Escherichia. Preferably, the host cell should secrete minimal amounts of proteolytic enzymes.

In addition to prokaryotes, eukaryotic microbes such as filamentous fungi or yeast may be
suitable cloning or expression hosts for vSmo-encoding vectors. Saccharomyces cerevisiae, or common baker's
yeast, is the most commonly used among lower eukaryotic host microorganisms. However, a number of other
genera, species, and strains are commonly available and useful herein.

Suitable host cells for the expression of glycosylated vSmo are derived from multicellular
organisms. Such host cells are capable of complex processing and glycosylation activities. In principle, any
higher eukaryotic cell culture is workable, whether from vertebrate or invertebrate culture. Examples of
invertebrate cells include plant and insect cells.

Propagation of vertebrate cells in culture (tissue culture) is also well known in the art [See,
e.g., Tissue Culture, Academic Press, Kruse and Patterson, editors (1973)]. Examples of useful mammalian
host cell lines are monkey kidney CV1 line transformed by SV40 (COS-7, ATCC CRL 1651); human
embryonic kidney line (293 or 293 cells subcloned for growth in suspension culture, Graham et al., J. Gen
Virol., 36:59 (1977)); baby hamster kidney cells (BHK, ATCC CCL 10); Chinese hamster ovary cells/-DHFR
(CHO, Urlaub and Chasin, Proc. Natl. Acad. Sci. USA, 77:4216 (1980)); mouse sertoli cells (TM4, Mather,
Biol. Reprod., 23:243-251 (1980)); monkey kidney cells (CV1 ATCC CCL 70); African green monkey kidney
cells (VERO-76, ATCC CRL-1587); human cervical carcinoma cells (HELA, ATCC CCL 2); canine kidney
cells (MDCK, ATCC CCL 34); buffalo rat liver cells (BRL 3A, ATCC CRL 1442); human lung cells (W138,
ATCC CCL 75); human liver cells (Hep G2, HB 8065); mouse mammary tumor (MMT 060562, ATCC
CCL51); TRI cells (Mather et al., Annals N.Y. Acad. Sci., 383:44-68 (1982)); MRC 5 cells; and FS4 cells.

Host cells are transfected and preferably transformed with the above-described expression
or cloning vectors for vSmo production and cultured in conventional nutrient media modified as appropriate
for inducing promoters, selecting transformants, or amplifying the genes encoding the desired sequences.

Transfection refers to the taking up of an expression vector by a host cell whether or not any
coding sequences are in fact expressed. Numerous methods of transfection are known to the ordinarily skilled
artisan, for example, CaPO4 and electroporation. Successful transfection is generally recognized when any
indication of the operation of this vector occurs within the host cell.

Transformation means introducing DNA into an organism so that the DNA is replicable,
either as an extrachromosomal element or by chromosomal integrant. Depending on the host cell used,
transformation is done using standard techniques appropriate to such cells. The calcium treatment employing
calcium chloride, as described in Sambrook et al., supra, or electroporation is generally used for prokaryotes
or other cells that contain substantial cell-wall barriers. Infection with Agrobacterium tumefaciens is used for
transformation of certain plant cells, as described by Shaw et al., Gene, 23:315 (1983) and WO 89/05859
published 29 June 1989. In addition, plants may be transfected using ultrasound treatment as described in WO
91/00358 published 10 January 1991.

For mammalian cells without such cell walls, the calcium phosphate precipitation method
of Graham and van der Eb, Virology, 52:456-457 (1978) is preferred. General aspects of mammalian cell host
system transformations have been described in U.S. Pat. No. 4,399,216. Transformations into yeast are
typically carried out according to the method of Van Solingen et al., J. Bact., 130:946 (1977) and Hsiao et al.,
Proc. Natl. Acad. Sci. (USA), 76:3829 (1979). However, other methods for introducing DNA into cells, such
as by nuclear microinjection, electroporation, bacterial protoplast fusion with intact cells, or polycations, e.g.,
polybrene, polyornithine, may also be used. For various techniques for transforming mammalian cells, see
Keown et al., Methods in Enzymology, 185:527-537 (1990) and Mansour et al., Nature, 336:348-352 (1988).

4. Culturing the Host Cells 

Prokaryotic cells used to produce vSmo may be cultured in suitable media as described
generally in Sambrook et al., supra.

The mammalian host cells used to produce vSmo may be cultured in a variety of media.
Examples of commercially available media include Ham's F10 (Sigma), Minimal Essential Medium ("MEM",
Sigma), RPMI-1640 (Sigma), and Dulbecco's Modified Eagle's Medium ("DMEM", Sigma). Any such media
may be supplemented as necessary with hormones and/or other growth factors (such as insulin, transferrin, or
epidermal growth factor), salts (such as sodium chloride, calcium, magnesium, and phosphate), buffers (such
as HEPES), nucleosides (such as adenosine and thymidine), antibiotics (such as Gentamycin™ drug), trace
elements (defined as inorganic compounds usually present at final concentrations in the micromolar range),
and glucose or an equivalent energy source. Any other necessary supplements may also be included at
appropriate concentrations that would be known to those skilled in the art. The culture conditions, such as
temperature, pH, and the like, are those previously used with the host cell selected for expression, and will be
apparent to the ordinarily skilled artisan.

In general, principles, protocols, and practical techniques for maximizing the productivity
of mammalian cell cultures can be found in Mammalian Cell Biotechnology: a Practical Approach, M. Butler,
ed. (IRL Press, 1991).

The host cells referred to in this disclosure encompass cells in culture as well as cells that
are within a host animal.

5. Detecting Gene Amplification/Expression 

Gene amplification and/or expression may be measured in a sample directly, for example,
by conventional Southern blotting, Northern blotting to quantitate the transcription of mRNA [Thomas, Proc.
Natl. Acad. Sci. USA, 77:5201-5205 (1980)], dot blotting (DNA analysis), or in situ hybridization, using an
appropriately labeled probe, based on the sequences provided herein. Various labels may be employed, most
commonly radioisotopes, and particularly 32P. However, other techniques may also be employed, such as
using biotin-modified nucleotides for introduction into a polynucleotide. The biotin then serves as the site for
binding to avidin or antibodies, which may be labeled with a wide variety of labels, such as radionucleotides,
fluorescers or enzymes. Alternatively, antibodies may be employed that can recognize specific duplexes,
including DNA duplexes, RNA duplexes, and DNA-RNA hybrid duplexes or DNA-protein duplexes. The
antibodies in turn may be labeled and the assay may be carried out where the duplex is bound to a surface, so
that upon the formation of duplex on the surface, the presence of antibody bound to the duplex can be detected.

Gene expression, alternatively, may be measured by immunological methods, such as
immunohistochemical staining of cells or tissue sections and assay of cell culture or body fluids, to quantitate
directly the expression of gene product. With immunohistochemical staining techniques, a cell sample is
prepared, typically by dehydration and fixation, followed by reaction with labeled antibodies specific for the
gene product coupled, where the labels are usually visually detectable, such as enzymatic labels, fluorescent
labels, or luminescent labels.

Antibodies useful for immunohistochemical staining and/or assay of sample fluids may be
either monoclonal or polyclonal, and may be prepared in any mammal. Conveniently, the antibodies may be
prepared against a native sequence vSmo protein or against a synthetic peptide based on the DNA sequences
provided herein.

6. Purification of vSmo 

It is contemplated that it may be desired to purify some form of vSmo from recombinant cell
proteins or polypeptides to obtain preparations that are substantially homogeneous as to vSmo. As a first step,
the culture medium or lysate may be centrifuged to remove particulate cell debris. vSmo thereafter may be
purified from contaminant soluble proteins and polypeptides, with the following procedures being exemplary
of suitable purification procedures: by fractionation on an ion-exchange column; ethanol precipitation; reverse
phase HPLC; chromatography on silica or on a cation-exchange resin such as DEAE; chromatofocusing;
SDS-PAGE; ammonium sulfate precipitation; gel filtration using, for example, Sephadex G-75; and protein A
Sepharose columns to remove contaminants such as IgG. vSmo variants may be recovered in the same fashion
as native sequence vSmo, taking account of any substantial changes in properties occasioned by the variation.

A protease inhibitor such as phenyl methyl sulfonyl fluoride (PMSF) also may be useful to
inhibit proteolytic degradation during purification, and antibiotics may be included to prevent the growth of
adventitious contaminants.

7. Covalent Modifications of vSmo

Covalent modifications of vSmo are included within the scope of this invention. One type
of covalent modification of the vSmo included within the scope of this invention comprises altering the native
glycosylation pattern of the protein. "Altering the native glycosylation pattern" is intended for purposes herein
to mean deleting one or more carbohydrate moieties found in native sequence vSmo, and/or adding one or more
glycosylation sites that are not present in the native sequence vSmo.

Glycosylation of polypeptides is typically either N-linked or O-linked. N-linked refers to
the attachment of the carbohydrate moiety to the side chain of an asparagine residue. The tripeptide sequences
asparagine-X-serine and asparagine-X-threonine, where X is any amino acid except proline, are the recognition
sequences for enzymatic attachment of the carbohydrate moiety to the asparagine side chain. Thus, the
presence of either of these tripeptide sequences in a polypeptide creates a potential glycosylation site. O-linked
glycosylation refers to the attachment of one of the sugars N-acetylgalactosamine, galactose, or xylose to a
hydroxyamino acid, most commonly serine or threonine, although 5-hydroxyproline or 5-hydroxylysine may
also be used.
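
The N-linked recognition rule stated above (asparagine-X-serine or asparagine-X-threonine, with X being any
residue except proline) lends itself to a short pattern search. The Python sketch below is a minimal
illustration under that rule alone; the function name and the toy sequence are hypothetical, and a match marks
only a potential site, since the text itself notes that the tripeptide creates a potential, not a confirmed,
glycosylation site.

```python
import re

# N-X-S/T with X != P. The lookahead keeps the match to the Asn itself,
# so overlapping tripeptides (e.g. "NNTS") are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def n_linked_sites(protein: str) -> list[int]:
    """Return 1-based positions of Asn residues that begin an N-X-S/T tripeptide."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


toy = "MNGSANPTNNTS"  # hypothetical one-letter sequence, not a vSmo fragment
print(n_linked_sites(toy))  # [2, 9, 10]; the Asn at position 6 is skipped (X = Pro)
```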

Addition of glycosylation sites to the vSmo may be accomplished by altering the amino acid
sequence such that it contains one or more of the above-described tripeptide sequences (for N-linked
glycosylation sites). The alteration may also be made by the addition of, or substitution by, one or more serine
or threonine residues to the native sequence vSmo (for O-linked glycosylation sites). The vSmo amino acid
sequence may optionally be altered through changes at the DNA level, particularly by mutating the DNA
encoding the vSmo protein at preselected bases such that codons are generated that will translate into the
desired amino acids. The DNA mutation(s) may be made using methods described above and in U.S. Pat. No.
5,364,934, supra.
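
To make the DNA-level alteration concrete, the following Python sketch swaps a single codon in a coding
sequence so that the encoded residue becomes asparagine, serine, or threonine. It is a simplified illustration:
the helper name, the one-codon-per-residue table, and the toy fragment are assumptions chosen for the example,
and an actual design would use the mutagenesis methods referenced above and would weigh codon usage in the
chosen host.

```python
# Illustrative codon choices (one valid codon per residue); not a recommendation.
CODON_FOR = {"N": "AAC", "S": "TCT", "T": "ACC"}


def substitute_codon(cds: str, position: int, amino_acid: str) -> str:
    """Replace the codon at 1-based residue `position` with a codon for `amino_acid`."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    start = 3 * (position - 1)
    if not 0 <= start < len(cds):
        raise ValueError("position lies outside the coding sequence")
    return cds[:start] + CODON_FOR[amino_acid] + cds[start + 3:]


cds = "ATGGGCGCTTCT"  # hypothetical fragment encoding Met-Gly-Ala-Ser
mutant = substitute_codon(cds, 2, "N")
print(mutant)  # codon 2 now encodes Asn, so residues 2-4 read Asn-Ala-Ser
```

In this toy case the Gly-to-Asn change creates an asparagine-X-serine tripeptide, i.e. a new potential N-linked
site, without altering any other codon.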

Another means of increasing the number of carbohydrate moieties on the vSmo is by
chemical or enzymatic coupling of glycosides to the polypeptide. Depending on the coupling mode used, the
sugar(s) may be attached to (a) arginine and histidine, (b) free carboxyl groups, (c) free sulfhydryl groups such
as those of cysteine, (d) free hydroxyl groups such as those of serine, threonine, or hydroxyproline, (e) aromatic
residues such as those of phenylalanine, tyrosine, or tryptophan, or (f) the amide group of glutamine. These
methods are described in WO 87/05330 published 11 September 1987, and in Aplin and Wriston, CRC Crit.
Rev. Biochem., pp. 259-306 (1981).

Removal of carbohydrate moieties present on the vSmo protein may be accomplished
chemically or enzymatically or by mutational substitution of codons encoding for amino acid residues that serve
as targets for glycosylation. For instance, chemical deglycosylation by exposing the polypeptide to the
compound trifluoromethanesulfonic acid, or an equivalent compound can result in the cleavage of most or all
sugars except the linking sugar (N-acetylglucosamine or N-acetylgalactosamine), while leaving the polypeptide
intact. Chemical deglycosylation is described by Hakimuddin, et al., Arch. Biochem. Biophys., 259:52 (1987)
and by Edge et al., Anal. Biochem., 118:131 (1981). Enzymatic cleavage of carbohydrate moieties on
polypeptides can be achieved by the use of a variety of endo- and exo-glycosidases as described by Thotakura
et al., Meth. Enzymol., 138:350 (1987).

Glycosylation at potential glycosylation sites may be prevented by the use of the compound
tunicamycin as described by Duskin et al., J. Biol. Chem., 257:3105 (1982). Tunicamycin blocks the formation
of protein-N-glycoside linkages.

8. vSmo Chimeras

The present invention also provides chimeric molecules comprising vSmo fused to another,
heterologous amino acid sequence or polypeptide. In one embodiment, the chimeric molecule comprises a
fusion of the vSmo with a tag polypeptide which provides an epitope to which an anti-tag antibody can
selectively bind. The epitope tag is generally provided at the amino- or carboxyl-terminus of the vSmo. Such
epitope-tagged forms of the vSmo are desirable as the presence thereof can be detected using a labeled antibody
against the tag polypeptide. Also, provision of the epitope tag enables the vSmo to be readily purified by
affinity purification using the anti-tag antibody. Affinity purification techniques and diagnostic assays
involving antibodies are described later herein.

Tag polypeptides and their respective antibodies are well known in the art. Examples include
the flu HA tag polypeptide and its antibody 12CA5 [Field et al., Mol. Cell. Biol., 8:2159-2165 (1988)]; the
c-myc tag and the 8F9, 3C7, 6E10, G4, B7 and 9E10 antibodies thereto [Evan et al., Molecular and Cellular
Biology, 5:3610-3616 (1985)]; and the Herpes Simplex virus glycoprotein D (gD) tag and its antibody
[Paborsky et al., Protein Engineering, 3(6):547-553 (1990)]. Other tag polypeptides have been disclosed.
Examples include the Flag-peptide [Hopp et al., BioTechnology, 6:1204-1210 (1988)]; the KT3 epitope
peptide [Martin et al., Science, 255:192-194 (1992)]; an α-tubulin epitope peptide [Skinner et al., J. Biol.
Chem., 266:15163-15166 (1991)]; and the T7 gene 10 protein peptide tag [Lutz-Freyermuth et al., Proc. Natl.
Acad. Sci. USA, 87:6393-6397 (1990)]. Once the tag polypeptide has been selected, an antibody thereto can
be generated using the techniques disclosed herein.

The general methods suitable for the construction and production of epitope-tagged vSmo
are the same as those disclosed hereinabove. vSmo-tag polypeptide fusions are most conveniently constructed
by fusing the cDNA sequence encoding the vSmo portion in-frame to the tag polypeptide DNA sequence and
expressing the resultant DNA fusion construct in appropriate host cells. Ordinarily, when preparing the
vSmo-tag polypeptide chimeras of the present invention, nucleic acid encoding the vSmo will be fused at its 3' end
to nucleic acid encoding the N-terminus of the tag polypeptide; however, 5' fusions are also possible.
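
The in-frame 3' fusion described above can be checked in silico before any construct is made. The Python sketch
below is a minimal illustration, assuming the standard genetic code: it removes a terminal stop codon from the
upstream open reading frame, appends the tag coding sequence, and translates the result. The function names and
the toy sequences are hypothetical; the toy tag encodes the peptide YPYDVPDYA, commonly cited as the HA epitope,
using an arbitrary choice of codons.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def translate(cds: str) -> str:
    """Translate complete codons of `cds`; '*' marks a stop codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def fuse_in_frame(orf: str, tag: str) -> str:
    """Join `orf` (5') to `tag` (3') in frame, dropping the stop codon of `orf`."""
    orf, tag = orf.upper(), tag.upper()
    if len(orf) % 3 or len(tag) % 3:
        raise ValueError("both sequences must contain whole codons")
    if orf[-3:] in STOP_CODONS:
        orf = orf[:-3]  # otherwise translation would end before the tag
    return orf + tag


toy_orf = "ATGGCTTGA"                       # hypothetical ORF: Met-Ala-stop
toy_tag = "TACCCATACGATGTTCCAGATTACGCTTAA"  # encodes YPYDVPDYA followed by a stop
fusion = fuse_in_frame(toy_orf, toy_tag)
print(translate(fusion))  # MAYPYDVPDYA*
```

A 5' fusion would simply reverse the order of the two sequences, in which case the tag would need a start codon
and no stop codon of its own.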

9. Methods of Using vSmo

vSmo, as disclosed in the present specification, has utility in therapeutic and non-therapeutic
applications. As a therapeutic, vSmo (or the nucleic acid encoding the same) can be employed in in vivo or
ex vivo gene therapy techniques. In non-therapeutic applications, nucleic acid sequences encoding the vSmo
may be used as a diagnostic for tissue-specific typing. For example, procedures like in situ hybridization,
Northern and Southern blotting, and PCR analysis may be used to determine whether DNA and/or RNA
encoding vSmo is present in the cell type(s) being evaluated. vSmo nucleic acid will also be useful for the
preparation of vSmo by the recombinant techniques described herein.

The isolated vSmo may be used in quantitative diagnostic assays as a control against which
samples containing unknown quantities of vSmo may be prepared. vSmo preparations are also useful in
generating antibodies, as standards in assays for vSmo (e.g., by labeling vSmo for use as a standard in a
radioimmunoassay, radioreceptor assay, or enzyme-linked immunoassay), and in affinity purification
techniques.

Nucleic acids which encode vSmo, such as the rat vSmo disclosed herein, can also be used
to generate either transgenic animals or "knock out" animals which, in turn, are useful in the development and
screening of therapeutically useful reagents. A transgenic animal (e.g., a mouse or rat) is an animal having cells
that contain a transgene, which transgene was introduced into the animal or an ancestor of the animal at a
prenatal, e.g., an embryonic stage. A transgene is a DNA which is integrated into the genome of a cell from
which a transgenic animal develops. In one embodiment, rat cDNA encoding rSmo or an appropriate sequence
thereof can be used to clone genomic DNA encoding Smo in accordance with established techniques and the
genomic sequences used to generate transgenic animals that contain cells which express DNA encoding Smo.
Methods for generating transgenic animals, particularly animals such as mice or rats, have become conventional
in the art and are described, for example, in U.S. Patent Nos. 4,736,866 and 4,870,009. Typically, particular
cells would be targeted for vSmo transgene incorporation with tissue-specific enhancers. Transgenic animals
that include a copy of a transgene encoding vSmo introduced into the germ line of the animal at an embryonic
stage can be used to examine the effect of increased expression of DNA encoding vSmo. Such animals can
be used as tester animals for reagents thought to confer protection from, for example, pathological conditions
associated with constitutive activity of vSmo or Hedgehog, including some forms of cancer that may result
therefrom, such as, for example, basal cell carcinoma, basal cell nevus syndrome and pancreatic carcinoma.
In accordance with this facet of the invention, an animal is treated with the reagent and a reduced incidence
of the pathological condition, compared to untreated animals bearing the transgene, would indicate a potential
therapeutic intervention for the pathological condition.

Alternatively, the non-human homologues of vSmo can be used to construct a vSmo "knock
out" animal which has a defective or altered gene encoding vSmo as a result of homologous recombination
between the endogenous gene encoding vSmo and altered genomic DNA encoding vSmo introduced into an
embryonic cell of the animal. For example, rat cDNA encoding Smo can be used to clone genomic DNA
encoding Smo in accordance with established techniques. A portion of the genomic DNA encoding Smo can
be deleted or replaced with another gene, such as a gene encoding a selectable marker which can be used to
monitor integration. Typically, several kilobases of unaltered flanking DNA (both at the 5' and 3' ends) are
included in the vector [see e.g., Thomas and Capecchi, Cell, 51:503 (1987) for a description of homologous
recombination vectors]. The vector is introduced into an embryonic stem cell line (e.g., by electroporation)
and cells in which the introduced DNA has homologously recombined with the endogenous DNA are selected
[see e.g., Li et al., Cell, 69:915 (1992)]. The selected cells are then injected into a blastocyst of an animal (e.g.,
a mouse or rat) to form aggregation chimeras [see e.g., Bradley, in Teratocarcinomas and Embryonic Stem
Cells: A Practical Approach, E. J. Robertson, ed. (IRL, Oxford, 1987), pp. 113-152]. A chimeric embryo can
then be implanted into a suitable pseudopregnant female foster animal and the embryo brought to term to create
a "knock out" animal. Progeny harboring the homologously recombined DNA in their germ cells can be
identified by standard techniques and used to breed animals in which all cells of the animal contain the
homologously recombined DNA. Knockout animals can be characterized, for instance, for their ability to
defend against certain pathological conditions and can be used in the study of the mechanism by which the
Hedgehog family of molecules exerts mitogenic, differentiative, and morphogenic effects.

B. Anti-vSmo Antibody Preparation

The present invention further provides anti-vSmo antibodies. Antibodies against vSmo may 
be prepared as follows. Exemplary antibodies include polyclonal, monoclonal, humanized, bispecific, and
heteroconjugate antibodies. 

1. Polyclonal Antibodies 

The vSmo antibodies may comprise polyclonal antibodies. Methods of preparing polyclonal
antibodies are known to the skilled artisan. Polyclonal antibodies can be raised in a mammal, for example, by
one or more injections of an immunizing agent and, if desired, an adjuvant. Typically, the immunizing agent
and/or adjuvant will be injected in the mammal by multiple subcutaneous or intraperitoneal injections. The
immunizing agent may include the vSmo protein or a fusion protein thereof. It may be useful to conjugate the
immunizing agent to a protein known to be immunogenic in the mammal being immunized. Examples of such
immunogenic proteins which may be employed include but are not limited to keyhole limpet hemocyanin,
serum albumin, bovine thyroglobulin, and soybean trypsin inhibitor. An aggregating agent such as alum may
also be employed to enhance the mammal's immune response. Examples of adjuvants which may be employed
include Freund's complete adjuvant and MPL-TDM adjuvant (monophosphoryl Lipid A, synthetic trehalose
dicorynomycolate). The immunization protocol may be selected by one skilled in the art without undue
experimentation. The mammal can then be bled, and the serum assayed for antibody titer. If desired, the
mammal can be boosted until the antibody titer increases or plateaus.

2. Monoclonal Antibodies 

The vSmo antibodies may, alternatively, be monoclonal antibodies. Monoclonal antibodies
may be prepared using hybridoma methods, such as those described by Kohler and Milstein, supra. In a
hybridoma method, a mouse, hamster, or other appropriate host animal is typically immunized (such as
described above) with an immunizing agent to elicit lymphocytes that produce or are capable of producing
antibodies that will specifically bind to the immunizing agent. Alternatively, the lymphocytes may be
immunized in vitro.

The immunizing agent will typically include the vSmo protein or a fusion protein thereof.
Cells expressing vSmo at their surface may also be employed. Generally, either peripheral blood lymphocytes
("PBLs") are used if cells of human origin are desired, or spleen cells or lymph node cells are used if
non-human mammalian sources are desired. The lymphocytes are then fused with an immortalized cell line using
a suitable fusing agent, such as polyethylene glycol, to form a hybridoma cell [Goding, Monoclonal Antibodies:
Principles and Practice, Academic Press, (1986) pp. 59-103]. Immortalized cell lines are usually transformed
mammalian cells, particularly myeloma cells of rodent, bovine and human origin. Usually, rat or mouse
myeloma cell lines are employed. The hybridoma cells may be cultured in a suitable culture medium that
preferably contains one or more substances that inhibit the growth or survival of the unfused, immortalized
cells. For example, if the parental cells lack the enzyme hypoxanthine guanine phosphoribosyl transferase
(HGPRT or HPRT), the culture medium for the hybridomas typically will include hypoxanthine, aminopterin,
and thymidine ("HAT medium"), which substances prevent the growth of HGPRT-deficient cells.

Preferred immortalized cell lines are those that fuse efficiently, support stable high level
expression of antibody by the selected antibody-producing cells, and are sensitive to a medium such as HAT
medium. More preferred immortalized cell lines are murine myeloma lines, which can be obtained, for
instance, from the Salk Institute Cell Distribution Center, San Diego, California and the American Type Culture
Collection, Rockville, Maryland. Human myeloma and mouse-human heteromyeloma cell lines also have been
described for the production of human monoclonal antibodies [Kozbor, J. Immunol., 133:3001 (1984); Brodeur
et al., Monoclonal Antibody Production Techniques and Applications, Marcel Dekker, Inc., New York, (1987)
pp. 51-63].

The culture medium in which the hybridoma cells are cultured can then be assayed for the
presence of monoclonal antibodies directed against vSmo. Preferably, the binding specificity of monoclonal
antibodies produced by the hybridoma cells is determined by immunoprecipitation or by an in vitro binding
assay, such as radioimmunoassay (RIA) or enzyme-linked immunoabsorbent assay (ELISA). Such techniques
and assays are known in the art. The binding affinity of the monoclonal antibody can, for example, be
determined by the Scatchard analysis of Munson and Pollard, Anal. Biochem., 107:220 (1980).
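
For illustration, a classical linear Scatchard transformation may be sketched in Python as follows. This is a minimal sketch assuming a single class of binding sites; it is not the fitting procedure of the cited reference, and the function name scatchard_fit and the data values are hypothetical.

```python
# Sketch of a classical linear Scatchard fit (illustrative data only).

def scatchard_fit(bound, free):
    """Fit bound/free against bound by least squares; return (Kd, Bmax).

    For a single class of sites, B/F = (Bmax - B) / Kd, so the slope of
    the line is -1/Kd and its x-intercept is Bmax.
    """
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two paired measurements")
    x = list(bound)
    y = [b / f for b, f in zip(bound, free)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope
    bmax = -intercept / slope
    return kd, bmax


# Hypothetical noise-free data generated from Kd = 2.0 nM and Bmax = 10.0 nM.
free_nM = [0.5, 1.0, 2.0, 4.0, 8.0]
bound_nM = [10.0 * f / (2.0 + f) for f in free_nM]
kd_nM, bmax_nM = scatchard_fit(bound_nM, free_nM)
```

With experimental data the points will scatter about the line, and curvature in the plot may indicate more than one class of binding sites, in which case a linear fit of this kind would not be appropriate.
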
After the desired hybridoma cells are identified, the clones may be subcloned by limiting
dilution procedures and grown by standard methods [Goding, supra]. Suitable culture media for this purpose
include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640 medium. Alternatively, the
hybridoma cells may be grown in vivo as ascites in a mammal.

The monoclonal antibodies secreted by the subclones may be isolated or purified from the 
culture medium or ascites fluid by conventional immunoglobulin purification procedures such as, for example,
protein A-Sepharose, hydroxylapatite chromatography, gel electrophoresis, dialysis, or affinity
chromatography. 

The monoclonal antibodies may also be made by recombinant DNA methods, such as those
described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal antibodies of the invention can be
readily isolated and sequenced using conventional procedures (e.g., by using oligonucleotide probes that are
capable of binding specifically to genes encoding the heavy and light chains of murine antibodies). The
hybridoma cells of the invention serve as a preferred source of such DNA. Once isolated, the DNA may be
placed into expression vectors, which are then transfected into host cells such as simian COS cells, Chinese
hamster ovary (CHO) cells, or myeloma cells that do not otherwise produce immunoglobulin protein, to obtain
the synthesis of monoclonal antibodies in the recombinant host cells. The DNA also may be modified, for
example, by substituting the coding sequence for human heavy and light chain constant domains in place of
the homologous murine sequences [U.S. Patent No. 4,816,567; Morrison et al., supra] or by covalently joining
to the immunoglobulin coding sequence all or part of the coding sequence for a non-immunoglobulin
polypeptide. Such a non-immunoglobulin polypeptide can be substituted for the constant domains of an
antibody of the invention, or can be substituted for the variable domains of one antigen-combining site of an
antibody of the invention to create a chimeric bivalent antibody.

The antibodies may be monovalent antibodies. Methods for preparing monovalent antibodies 
are well known in the art. For example, one method involves recombinant expression of immunoglobulin light 
chain and modified heavy chain. The heavy chain is truncated generally at any point in the Fc region so as to
prevent heavy chain crosslinking. Alternatively, the relevant cysteine residues are substituted with another 
amino acid residue or are deleted so as to prevent crosslinking. 

In vitro methods are also suitable for preparing monovalent antibodies. Digestion of
antibodies to produce fragments thereof, particularly Fab fragments, can be accomplished using routine
techniques known in the art. For instance, digestion can be performed using papain. Examples of papain
digestion are described in WO 94/29348 published 12/22/94 and U.S. Patent No. 4,342,566. Papain digestion
of antibodies typically produces two identical antigen binding fragments, called Fab fragments, each with a
single antigen binding site, and a residual Fc fragment. Pepsin treatment yields an F(ab')2 fragment that has
two antigen combining sites and is still capable of cross-linking antigen.

The Fab fragments produced in the antibody digestion also contain the constant domains of
the light chain and the first constant domain (CH1) of the heavy chain. Fab' fragments differ from Fab
fragments by the addition of a few residues at the carboxy terminus of the heavy chain CH1 domain including
one or more cysteines from the antibody hinge region. Fab'-SH is the designation herein for Fab' in which the
cysteine residue(s) of the constant domains bear a free thiol group. F(ab')2 antibody fragments originally were
produced as pairs of Fab' fragments which have hinge cysteines between them. Other chemical couplings of
antibody fragments are also known.

3. Humanized Antibodies 

The vSmo antibodies of the invention may further comprise humanized antibodies or human
antibodies. Humanized forms of non-human (e.g., murine) antibodies are chimeric immunoglobulins,
immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab', F(ab')2 or other antigen-binding
subsequences of antibodies) which contain minimal sequence derived from non-human immunoglobulin.
Humanized antibodies include human immunoglobulins (recipient antibody) in which residues from a
complementarity determining region (CDR) of the recipient are replaced by residues from a CDR of a
non-human species (donor antibody) such as mouse, rat or rabbit having the desired specificity, affinity and
capacity. In some instances, Fv framework residues of the human immunoglobulin are replaced by
corresponding non-human residues. Humanized antibodies may also comprise residues which are found neither
in the recipient antibody nor in the imported CDR or framework sequences. In general, the humanized
antibody will comprise substantially all of at least one, and typically two, variable domains, in which all or
substantially all of the CDR regions correspond to those of a non-human immunoglobulin and all or
substantially all of the FR regions are those of a human immunoglobulin consensus sequence. The humanized
antibody optimally also will comprise at least a portion of an immunoglobulin constant region (Fc), typically
that of a human immunoglobulin [Jones et al., Nature, 321:522-525 (1986); Riechmann et al., Nature,
332:323-329 (1988); and Presta, Curr. Op. Struct. Biol., 2:593-596 (1992)].

Methods for humanizing non-human antibodies are well known in the art. Generally, a
humanized antibody has one or more amino acid residues introduced into it from a source which is non-human.
These non-human amino acid residues are often referred to as "import" residues, which are typically taken from
an "import" variable domain. Humanization can be essentially performed following the method of Winter and
co-workers [Jones et al., Nature, 321:522-525 (1986); Riechmann et al., Nature, 332:323-327 (1988);
Verhoeyen et al., Science, 239:1534-1536 (1988)], by substituting rodent CDRs or CDR sequences for the
corresponding sequences of a human antibody. Accordingly, such "humanized" antibodies are chimeric
antibodies (U.S. Patent No. 4,816,567), wherein substantially less than an intact human variable domain has
been substituted by the corresponding sequence from a non-human species. In practice, humanized antibodies
are typically human antibodies in which some CDR residues and possibly some FR residues are substituted by
residues from analogous sites in rodent antibodies.

The choice of human variable domains, both light and heavy, to be used in making the
humanized antibodies is very important in order to reduce antigenicity. According to the "best-fit" method,
the sequence of the variable domain of a rodent antibody is screened against the entire library of known human
variable domain sequences. The human sequence which is closest to that of the rodent is then accepted as the
human framework (FR) for the humanized antibody [Sims et al., J. Immunol., 151:2296 (1993); Chothia and
Lesk, J. Mol. Biol., 196:901 (1987)]. Another method uses a particular framework derived from the consensus
sequence of all human antibodies of a particular subgroup of light or heavy chains. The same framework may
be used for several different humanized antibodies [Carter et al., Proc. Natl. Acad. Sci. USA, 89:4285 (1992);
Presta et al., J. Immunol., 151:2623 (1993)].
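
The "best-fit" screening step described above may be sketched in Python as follows. This is a simplified illustration which assumes that the sequences have already been aligned to equal length and which scores closeness by percent identity alone; the sequence fragments, the library names, and the function names are hypothetical.

```python
# Sketch of "best-fit" framework selection by percent identity.
# Sequences are short, hypothetical, pre-aligned strings of equal length.

def percent_identity(a: str, b: str) -> float:
    """Percentage of aligned positions at which a and b carry the same residue."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def best_fit(rodent: str, human_library: dict) -> tuple:
    """Return (name, identity) of the human sequence closest to the rodent one."""
    scored = {
        name: percent_identity(rodent, seq)
        for name, seq in human_library.items()
    }
    best_name = max(scored, key=scored.get)
    return best_name, scored[best_name]


rodent_vh = "EVQLQQSGAE"          # hypothetical rodent fragment
human_vh_library = {              # hypothetical human fragments
    "human_A": "EVQLVESGGG",
    "human_B": "QVQLQQSGAE",
    "human_C": "EVQLLESGGG",
}
name, identity = best_fit(rodent_vh, human_vh_library)
```

In practice the comparison would be made over full-length variable domain sequences, and a substitution-aware score could be used in place of simple identity.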

It is further important that antibodies be humanized with retention of high affinity for the
antigen and other favorable biological properties. To achieve this goal, according to a preferred method,
humanized antibodies are prepared by a process of analysis of the parental sequences and various conceptual
humanized products using three dimensional models of the parental and humanized sequences. Three
dimensional immunoglobulin models are commonly available and are familiar to those skilled in the art.
Computer programs are available which illustrate and display probable three-dimensional conformational
structures of selected candidate immunoglobulin sequences. Inspection of these displays permits analysis of
the likely role of the residues in the functioning of the candidate immunoglobulin sequence, i.e., the analysis
of residues that influence the ability of the candidate immunoglobulin to bind its antigen. In this way, FR
residues can be selected and combined from the consensus and import sequence so that the desired antibody
characteristic, such as increased affinity for the target antigen(s), is achieved. In general, the CDR residues are
directly and most substantially involved in influencing antigen binding [see, WO 94/04679 published 3 March
1994].

Transgenic animals (e.g., mice) that are capable, upon immunization, of producing a full
repertoire of human antibodies in the absence of endogenous immunoglobulin production can be employed.
For example, it has been described that the homozygous deletion of the antibody heavy chain joining region
(JH) gene in chimeric and germ-line mutant mice results in complete inhibition of endogenous antibody
production. Transfer of the human germ-line immunoglobulin gene array in such germ-line mutant mice will
result in the production of human antibodies upon antigen challenge [see, e.g., Jakobovits et al., Proc. Natl.
Acad. Sci. USA, 90:2551-255 (1993); Jakobovits et al., Nature, 362:255-258 (1993); Bruggermann et al., Year
in Immuno., 7:33 (1993)]. Human antibodies can also be produced in phage display libraries [Hoogenboom
and Winter, J. Mol. Biol., 227:381 (1991); Marks et al., J. Mol. Biol., 222:581 (1991)]. The techniques of
Cote et al. and Boerner et al. are also available for the preparation of human monoclonal antibodies [Cote et
al., Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, p. 77 (1985) and Boerner et al., J. Immunol.,
147(1):86-95 (1991)].

4. Bispecific Antibodies 

Bispecific antibodies are monoclonal, preferably human or humanized, antibodies that have 
binding specificities for at least two different antigens. In the present case, one of the binding specificities is
for the vSmo, the other one is for any other antigen, and preferably for a cell-surface protein or receptor or
receptor subunit. 

Methods for making bispecific antibodies are known in the art. Traditionally, the recombinant
production of bispecific antibodies is based on the co-expression of two immunoglobulin
heavy-chain/light-chain pairs, where the two heavy chains have different specificities [Milstein and Cuello,
Nature, 305:537-539 (1983)]. Because of the random assortment of immunoglobulin heavy and light chains,
these hybridomas (quadromas) produce a potential mixture of ten different antibody molecules, of which only
one has the correct bispecific structure. The purification of the correct molecule is usually accomplished by
affinity chromatography steps. Similar procedures are disclosed in WO 93/08829, published 13 May 1993, and in
Traunecker et al., EMBO J., 10:3655-3659 (1991).
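
The figure of ten different antibody molecules can be reproduced by a short enumeration. The following Python sketch assumes two heavy chains and two light chains that pair at random, and treats the two arms of an antibody as interchangeable; the chain labels are hypothetical.

```python
# Enumerate the antibody species expected from random chain assortment
# in a quadroma expressing two heavy chains and two light chains.
from itertools import combinations_with_replacement, product

heavy_chains = ["H1", "H2"]
light_chains = ["L1", "L2"]

# Each half-molecule is one heavy chain paired with one light chain.
half_molecules = list(product(heavy_chains, light_chains))

# A full antibody is an unordered pair of half-molecules,
# because the two arms are interchangeable.
antibodies = list(combinations_with_replacement(half_molecules, 2))

# The desired bispecific pairs each heavy chain with its cognate light chain.
bispecific = [
    ab for ab in antibodies
    if set(ab) == {("H1", "L1"), ("H2", "L2")}
]

print(len(half_molecules), len(antibodies), len(bispecific))  # prints "4 10 1"
```

Four possible half-molecules give ten unordered pairs, of which exactly one carries both cognate heavy-chain/light-chain pairings.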

According to a different and more preferred approach, antibody variable domains with the
desired binding specificities (antibody-antigen combining sites) are fused to immunoglobulin constant domain
sequences. The fusion preferably is with an immunoglobulin heavy-chain constant domain, comprising at least
part of the hinge, CH2, and CH3 regions. It is preferred to have the first heavy-chain constant region (CH1)
containing the site necessary for light-chain binding present in at least one of the fusions. DNAs encoding the
immunoglobulin heavy-chain fusions and, if desired, the immunoglobulin light chain, are inserted into separate
expression vectors, and are co-transfected into a suitable host organism. This provides for great flexibility in
adjusting the mutual proportions of the three polypeptide fragments in embodiments when unequal ratios of
the three polypeptide chains used in the construction provide the optimum yields. It is, however, possible to
insert the coding sequences for two or all three polypeptide chains in one expression vector when the
expression of at least two polypeptide chains in equal ratios results in high yields or when the ratios are of no
particular significance. In a preferred embodiment of this approach, the bispecific antibodies are composed
of a hybrid immunoglobulin heavy chain with a first binding specificity in one arm, and a hybrid
immunoglobulin heavy-chain/light-chain pair (providing a second binding specificity) in the other arm. It was
found that this asymmetric structure facilitates the separation of the desired bispecific compound from
unwanted immunoglobulin chain combinations, as the presence of an immunoglobulin light chain in only one
half of the bispecific molecule provides for a facile way of separation. This approach is disclosed in WO
94/04690 published 3 March 1994. For further details of generating bispecific antibodies see, for example,
Suresh et al., Methods in Enzymology, 121:210 (1986).

5. Heteroconjugate Antibodies

Heteroconjugate antibodies are also within the scope of the present invention. 
Heteroconjugate antibodies are composed of two covalently joined antibodies. Such antibodies have, for 
example, been proposed to target immune system cells to unwanted cells [U.S. Patent No. 4,676,980], and for
treatment of HIV infection [WO 91/00360; WO 92/200373; EP 03089]. It is contemplated that the antibodies
may be prepared in vitro using known methods in synthetic protein chemistry, including those involving
crosslinking agents. For example, immunotoxins may be constructed using a disulfide exchange reaction or
by forming a thioether bond. Examples of suitable reagents for this purpose include iminothiolate and
methyl-4-mercaptobutyrimidate and those disclosed, for example, in U.S. Pat. No. 4,676,980.

6. Uses of vSmo Antibodies

vSmo antibodies may be used in diagnostic assays for vSmo, e.g., detecting its expression
in specific cells or tissues. Various diagnostic assay techniques known in the art may be used, such as
competitive binding assays, direct or indirect sandwich assays and immunoprecipitation assays conducted in
either heterogeneous or homogeneous phases [Zola, Monoclonal Antibodies: A Manual of Techniques, CRC
Press, Inc. (1987) pp. 147-158]. The antibodies used in the diagnostic assays can be labeled with a detectable
moiety. The detectable moiety should be capable of producing, either directly or indirectly, a detectable signal.
For example, the detectable moiety may be a radioisotope, such as 3H, 14C, 32P, 35S, or 125I, a fluorescent
or chemiluminescent compound, such as fluorescein isothiocyanate, rhodamine, or luciferin, or an enzyme, such
as alkaline phosphatase, beta-galactosidase or horseradish peroxidase. Any method known in the art for
conjugating the antibody to the detectable moiety may be employed, including those methods described by
Hunter et al., Nature, 144:945 (1962); David et al., Biochemistry, 13:1014 (1974); Pain et al., J. Immunol.
Meth., 40:219 (1981); and Nygren, J. Histochem. and Cytochem., 30:407 (1982).

vSmo antibodies also are useful for the affinity detection or purification of vSmo from
recombinant cell culture or natural sources. In this process, the antibodies against vSmo are immobilized on
a suitable support, such as a Sephadex resin or filter paper, using methods well known in the art. The immobilized
antibody then is contacted with a sample containing the vSmo, and thereafter the support is washed with a
suitable solvent that will remove substantially all the material in the sample except the vSmo, which is bound
to the immobilized antibody. Finally, the support is washed with another suitable solvent that will release the
vSmo from the antibody.

The vSmo antibodies may also be employed as therapeutics. For example, vSmo antibodies
may be used to block or neutralize excess vSmo signalling that may result from mutant or inactivated Patched.
Accordingly, the vSmo antibodies may be used in the treatment of, or amelioration of symptoms caused by,
a pathological condition resulting from or associated with excess vSmo or vSmo signalling. Optionally,
agonistic vSmo antibodies can be employed to induce the formation of, or enhance or stimulate tissue
regeneration, such as regeneration of skin tissue, lung tissue, muscle (such as heart or skeletal muscle), neural
tissue (such as serotonergic neurons, motoneurons or striatal neurons), bone tissue or gut tissue. This vSmo
antibody therapy will be useful in instances where the tissue has been damaged by disease, aging or trauma.

The vSmo antibodies may be used or administered to a patient in a pharmaceutically-acceptable
carrier. Suitable carriers and their formulations are described in Remington's Pharmaceutical
Sciences, 16th ed., 1980, Mack Publishing Co., edited by Osol et al. If the vSmo antibodies are to be
administered to a patient, the antibodies can be administered by injection (e.g., intravenous, intraperitoneal,
subcutaneous, intramuscular), or by other methods such as infusion that ensure their delivery to the bloodstream
in an effective form. Effective dosages and schedules for administering the vSmo antibodies may be
determined empirically, and making such determinations is within the skill in the art. Those skilled in the art
will understand that the dosage of vSmo antibodies that must be administered will vary depending on, for
example, the patient which will receive the antibodies, the route of administration, and other therapeutic agents
being administered to the mammal. Guidance in selecting appropriate doses for such vSmo antibodies is found
in the literature on therapeutic uses of antibodies, e.g., Handbook of Monoclonal Antibodies, Ferrone et al.,
eds., Noyes Publications, Park Ridge, N.J., (1985) ch. 22 and pp. 303-357; Smith et al., Antibodies in Human
Diagnosis and Therapy, Haber et al., eds., Raven Press, New York (1977) pp. 365-389. A typical daily dosage
of the vSmo antibodies used alone might range from about 1 to up to 100 mg/kg of body weight or more
per day, depending on the factors mentioned above.

C. Kits Containing vSmo or vSmo Antibodies 

In another embodiment of the invention, there are provided articles of manufacture and kits
containing vSmo or vSmo antibodies. The article of manufacture typically comprises a container with a label.
Suitable containers include, for example, bottles, vials, and test tubes. The containers may be formed from
a variety of materials such as glass or plastic. The container holds the vSmo or vSmo antibodies. The label
on the container may indicate directions for either in vivo or in vitro use, such as those described above.

The kit of the invention will typically comprise the container described above and one or
more other containers comprising materials desirable from a commercial and user standpoint, including buffers,
diluents, filters, and package inserts with instructions for use.

D. Additional Compositions of Matter

In a further embodiment of the invention, there are provided protein complexes comprising
vertebrate Smoothened protein and vertebrate Patched protein. As demonstrated in the Examples, vertebrate
Smoothened and vertebrate Patched can form a complex. The protein complex which includes vertebrate
Smoothened and vertebrate Patched may also include vertebrate Hedgehog protein. Typically in such a
complex, the vertebrate Hedgehog binds to the vertebrate Patched but does not bind to the vertebrate
Smoothened. In a preferred embodiment, the complex comprising vertebrate Smoothened and vertebrate
Patched is a receptor for vertebrate Hedgehog.

The invention also provides a vertebrate Patched which binds to vertebrate Smoothened.
Optionally the vertebrate Patched comprises a sequence which is a derivative of or fragment of a native
sequence vertebrate Patched. The vertebrate Patched will typically consist of a sequence which has less than
100% sequence identity with a native sequence vertebrate Patched. In one embodiment, the vertebrate Patched
directly and specifically binds vertebrate Smoothened. Alternatively, it is contemplated that the vertebrate
Patched may bind vertebrate Smoothened indirectly.



WO 98/14475 PCT/US97/17433



The following examples are offered for illustrative purposes only, and are not intended to
limit the scope of the present invention in any way.

All references cited in the present specification are hereby incorporated by reference in their
entirety.

EXAMPLES

All commercially available reagents referred to in the examples were used according to
manufacturer's instructions unless otherwise indicated. The source of those cells identified in the following
examples, and throughout the specification, by ATCC accession numbers is the American Type Culture
Collection, Rockville, Maryland.

EXAMPLE 1

Isolation and Cloning of Rat Smoothened cDNA

Full-length rat Smoothened cDNA was isolated by screening
of 1.2 X 10^ plaques of an embryonic day 9-10 rat cDNA library (containing cDNAs size-selected >I500 base 
pairs), using the entire coding region of Drosophila Smoothened [Alcedo et. aL, sum] (labeled with ^^P- 
15 dCTP) as a probe. The library was prepared by cloning cDNA inserts into the Not! site of a lambda RK18 
vector [Klein el. aL, Pmr Nnrl. Acad. Sci. . 22:7108-71 13 (1996)] following XmnI adapters ligation. 
Conditions for hybridization were: 5 x SSC, 30% formamide. 5 x Denhardfs, 50 mM sodium phosphate (pH 
6.5), 5% dextran sulfate, 0. 1% SDS and 50 ng/ml salmon sperm DNA, overnight at 42^C. Nitrocellulose filters 
were washed to a stringency of 1 x SSC at 42*^C, and exposed overnight to Kodak X-AR film. Three of eight 
20 positive plaques were selected for further purification. After amplification of the plaque-purified phage, 
phagemid excision products were generated by growing M13 helper phage (M13K07; obtained from New 
England Biolabs), bacteria (BB4; obtained from Stratagenej, and the purified phage together in a 100:10:1 
ratio. Plasmid DNA was recovered by Qiagen purification from ampiciilin-resistant colonies following 
infection of BB4 with the excised purified phagemid. 
Sequencing of the three cDNAs showed them to be identical, with the exception that two contained only a partial coding sequence, whereas the third contained the entire open reading frame of rat Smoothened, including 449 and 1022 nucleotides, respectively, of 5' and 3' untranslated sequence and a poly-A tail. This cDNA clone was sequenced completely on both strands.

The entire nucleotide sequence of rat Smoothened (rSmo) is shown in Figure 1 (SEQ ID NO:1) (reference is also made to Applicants' ATCC deposit of the rat Smoothened in pRK5.rsmo.AR140, assigned ATCC Dep. No. 98165). The cDNA contained an open reading frame with a translational initiation site assigned to the ATG codon at nucleotide positions 450-452. The open reading frame ends at the termination codon at nucleotide positions 2829-2831.
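As a quick consistency check on these coordinates, the position of the termination codon follows from the position of the initiating ATG and the length of the predicted protein (793 amino acids, SEQ ID NO:2). The sketch below is illustrative only: the function name `stop_codon_span` is hypothetical, and it assumes an uninterrupted reading frame with nucleotides numbered from 1.

```python
def stop_codon_span(start_pos, n_residues):
    """Return the 1-based (first, last) nucleotide positions of the stop codon.

    Assumes an uninterrupted reading frame that begins at start_pos
    (first base of the ATG) and encodes n_residues amino acids.
    """
    first = start_pos + 3 * n_residues  # base right after the last sense codon
    return first, first + 2


# Values stated in the text for rat Smoothened (SEQ ID NO:1).
rat_start = 450      # first base of the initiating ATG
rat_residues = 793   # length of the predicted protein (SEQ ID NO:2)

print(stop_codon_span(rat_start, rat_residues))  # (2829, 2831)
print(rat_start - 1)  # 449 nucleotides of 5' untranslated sequence
```

Under these assumptions the computed span matches the stated termination codon at positions 2829-2831, and the 449 nucleotides preceding the ATG match the stated length of the 5' untranslated sequence.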

The predicted amino acid sequence of the rat Smoothened (rSmo) contains 793 amino acids (including a 32 amino acid signal peptide), as shown in Figure 1 (SEQ ID NO:2). rSmo appears to be a typical seven transmembrane (7 TM), G protein-coupled receptor, containing 4 potential N-glycosylation sites and a 203 amino acid long putative extracellular amino-terminus domain which contains 13 stereotypically spaced cysteines (see Fig. 2).

An alignment of the rSmo sequence with sequences for dSmo, wingless receptor and vertebrate Frizzled revealed that rSmo is 33% homologous to the dSmo sequence reported in Alcedo et al., supra (50% homologous in the transmembrane domains); 23% homologous to the wingless receptor sequence reported in Bhanot et al., supra; and 25% homologous to the vertebrate Frizzled sequence reported in Chan et al., supra.
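Potential N-glycosylation sites such as those counted for rSmo are commonly located by scanning a protein sequence for the consensus sequon Asn-X-Ser/Thr, where X is any residue except proline. The specification does not state which rule was used, so the following is only a sketch under that assumption. The function name `find_sequons` is hypothetical, and the input is limited to the first 60 residues of SEQ ID NO:2 (transcribed into one-letter code), so it does not reproduce the full count of four sites.

```python
import re

# First 60 residues of SEQ ID NO:2, written in one-letter code.
RSMO_N_TERM = (
    "MAAGRPVRGPELAPR"
    "RLLQLLLLVLLGGRG"
    "RGAALSGNVTGPGPR"
    "SAGGSARRNAPVTSP"
)


def find_sequons(protein):
    """Return 1-based positions of Asn residues in N-X-S/T sequons (X != Pro).

    A lookahead is used so that overlapping sequons are all reported.
    """
    return [m.start() + 1 for m in re.finditer(r"(?=N[^P][ST])", protein)]


print(find_sequons(RSMO_N_TERM))  # [38]
```

In this fragment only Asn-38 (Asn-Val-Thr) satisfies the assumed rule; Asn-54 is followed by Ala-Pro and is therefore not reported.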

EXAMPLE 2

In Situ Hybridization and Northern Blot Analysis

In situ hybridization and Northern blot analyses were conducted to examine tissue distribution of Smo, Patched and SHH in embryonic and adult rat tissues.

For in situ hybridization, E9-E15.5 rat embryos (Hollister Labs) were immersion-fixed overnight at 4°C in 4% paraformaldehyde, then cryoprotected overnight in 20% sucrose. Adult rat brains and spinal cords were frozen fresh. All tissues were sectioned at 16 µm, and processed for in situ hybridization using ^^P-UTP labelled RNA probes as described in Treanor et al., Nature, 382:80-83 (1996). Sense and antisense probes were derived from the N-terminal region of rSmo using T7 polymerase. The probe used to detect SHH was antisense to bases 604-1314 of mouse SHH [Echelard et al., Cell, 75:1417-1430 (1993)]. The probe used to detect Patched was antisense to bases 502-1236 of mouse Patched [Goodrich et al., supra]. Reverse transcriptase polymerase chain reaction analysis was performed as described in Treanor et al., supra.

For Northern blot analysis, a rat multiple tissue Northern blot (Clontech) was hybridized and washed at high stringency according to the manufacturer's protocol, using a 32P-dCTP-labelled probe encompassing the entire rSmo coding region.

The results are illustrated in Figure 3. By in situ hybridization and Northern blot analysis, expression of rSmo mRNA was detected from E9 onward in SHH responsive tissues such as the neural folds and early neural tube [Echelard et al., supra; Krauss et al., supra; Roelink et al., supra], pre-somitic mesoderm and somites [Johnson et al., supra; Fan et al., supra], and developing limb buds [Riddle et al., supra], gut [Roberts et al., supra] and eye [Krauss et al., supra]. Rat Smo transcripts were also found in tissues whose development is regulated by other members of the vertebrate HH protein family such as testes (desert HH) [Bitgood et al., Curr. Biol., 6:298-304 (1996)], cartilage (Indian HH) [Vortkamp et al., Science, 273:613-622 (1996)], and muscle (the zebrafish echidna HH) [Currie and Ingham, Nature, 382:452-455 (1996)] (See e.g., Fig. 3; other data not shown). In all of the above recited tissues, rSmo appeared to be co-expressed with rPatched.

rSmo and rPatched mRNAs were also found in and around SHH expressing cells in the embryonic lung, epiglottis, thymus, vertebral column, tongue, jaw, taste buds and teeth (Fig. 3). In the embryonic nervous system, rSmo and rPatched are initially expressed throughout the neural plate; by E12, however, their expression declines in lateral parts of the neural tube, and by P1, was restricted to cells in relatively close proximity to the ventricular zone (Fig. 3). In the adult rat tissues, rSmo expression was maintained in the brain, lung, kidney, testis, heart and spleen (data not shown).

EXAMPLE 3

Isolation and Cloning of Human Smoothened cDNA

A cDNA probe corresponding to the coding region of the rat Smoothened gene (described in Example 1 above) was labeled by the random hexanucleotide method and used to screen a human embryonic lung cDNA library (Clontech, Inc.) in lambda gt10. Duplicate filters were hybridized in 50% formamide, 5x SSC, 10x Denhardt's, 0.05 M sodium phosphate (pH 6.5), 0.1% sodium pyrophosphate, 50 µg/ml of sonicated salmon sperm DNA. Filters were rinsed in 2x SSC and then washed once in 0.5x SSC, 0.1% SDS at 42°C. Hybridizing phage were plaque-purified and the cDNA inserts were subcloned into pUC 118 (New England Biolabs). Two clones, 5 and 14, had overlapping inserts of approximately 2 and 2.8 kb, respectively, covering the entire human Smoothened coding sequence (See Fig. 4). Clones 5 and 14 were deposited by Applicants with ATCC as puc.118.hsmo.5 and puc.118.hsmo.14, respectively, and assigned ATCC Dep. Nos. 98162 and 98163. Both strands were sequenced by standard fluorescent methods on an ABI377 automated sequencer.

The entire nucleotide sequence of human Smoothened is shown in Figure 4 (SEQ ID NO:3). The cDNA contained an open reading frame with a translational initiation site assigned to the ATG codon at nucleotide positions 13-15. The open reading frame ends at the termination codon at nucleotide positions [...].

The predicted amino acid sequence of the human Smoothened (hSmo) contains 787 amino acids (including a 29 amino acid signal peptide), as shown in Figure 4 (SEQ ID NO:4). hSmo appears to be a typical seven transmembrane (7 TM), G protein-coupled receptor, containing 5 potential N-glycosylation sites and a 202 amino acid long putative extracellular amino-terminus domain which contains 13 stereotypically spaced cysteines.

An alignment of the predicted hSmo amino acid sequence and rSmo sequence (see Example 1) revealed 94% amino acid identity. An alignment of the hSmo sequence with sequences for dSmo, wingless receptor and vertebrate Frizzled revealed that hSmo is 33% homologous to the dSmo sequence reported in Alcedo et al., supra (50% homologous in the transmembrane domains); 23% homologous to the wingless receptor sequence reported in Bhanot et al., supra; and 25% homologous to the vertebrate Frizzled sequence reported in Chan et al., supra. See Figure 5 for a comparison of the primary sequences of human Smo, rat Smo and Drosophila Smo.
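Figures such as the 94% amino acid identity between hSmo and rSmo are typically derived from a pairwise alignment by counting identical columns. The specification does not say which alignment program or denominator was used, so the sketch below is an assumption-laden illustration: `percent_identity` is a hypothetical helper, it expects two already aligned strings of equal length, and it skips any column that contains a gap. The toy sequences are invented for demonstration and are not taken from the sequence listing.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of ungapped alignment columns in which both residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # columns containing a gap are not counted (assumption)
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Invented toy alignment: 6 identical residues over 7 ungapped columns.
toy_a = "MAAGR-PV"
toy_b = "MASGRLPV"
print(round(percent_identity(toy_a, toy_b), 2))  # 85.71
```

Counting gapped columns in the denominator would give a lower figure for the same alignment, which is one reason reported percentages can differ between tools.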

EXAMPLE 4

Competitive Binding, Co-immunoprecipitation, and Cross-linking Assays

Competitive binding, co-immunoprecipitation and cross-linking assays were conducted to characterize physical association or binding between SHH and rSmo, and between certain biologically active forms of SHH and cells expressing rSmo, mPatched, or both rSmo and mPatched.

1. Materials and Methods

Complementary DNAs for rSmo (described in Example 1); dSmo (described in Alcedo et al., supra); Desert HH (described in Echelard et al., supra); and murine Patched (described in Goodrich et al., supra) were cloned into pRK5 vectors, and epitope tags [Flag epitope tag (Kodak/IBI) and Myc epitope tag (9E10 epitope; InVitrogen)] added to the extreme C-terminus by PCR-based mutagenesis.






SHH-N is the biologically active amino terminus portion of SHH [Lee et al., Science, 266:1528-1537 (1994)]. SHH-N was produced as described by Hynes et al., supra. A radiolabeled form of SHH-N, 125I-SHH-N, was employed.

For IgG-SHH-N production, human embryonic kidney 293 cells were transiently transfected with the expression vector encoding SHH-N fused in frame after amino acid residue 198 to the Fc portion of human IgG-gamma1.

Cells were maintained in serum-free media (OptiMEM; Gibco BRL) for 48 hours. The media was then collected and concentrated 10-fold using a Centricon-10 membrane. Conditioned media was used at a concentration of 2x.

Binding assays were conducted to test binding between cells expressing rSmo or dSmo and (1) epitope tagged SHH-N, (2) an IgG-SHH-N chimera, and (3) an epitope tagged Desert HH.

For visualization of SHH binding, COS-7 cells (Genentech, Inc.) transiently expressing rSmo or mPatched (murine Patched) were exposed to epitope tagged SHH-N (2 hours at 4°C), washed 4 times with PBS, then fixed and stained with a cy3-conjugated anti-human IgG (Jackson ImmunoResearch) (for IgG-SHH-N) or anti-Flag M2 antibody (Kodak/IBI) (for Flag-tagged SHH-N).

For immunohistochemistry, COS-7 cells transiently transfected with expression constructs were fixed (10 minutes in 2% paraformaldehyde/0.2% Triton-X 100) and stained using monoclonal anti-Flag M2 antibody (IBI) or anti-Myc antibody (InVitrogen), followed by cy3-conjugated anti-mouse IgG (Jackson ImmunoResearch).

For cross-linking, cells were resuspended at a density of 1-2 x 10^/ml in ice-cold L15 media containing 0.1% BSA and 50 pM 125I-labeled SHH (with or without a 1000-fold excess of unlabeled SHH) and incubated at 4°C for 2 hr. 10 mM 1-ethyl-3-(3-dimethylaminopropyl) carbodiimide HCl and 5 mM N-hydroxysulfosuccinimide (Pierce Chemical) were added to the samples and incubated at room temperature for 30 minutes. The cells were then washed 3 times with 1 ml of PBS. Cells were then lysed in lysis buffer [1% Brij-96 (Sigma), 50 mM Tris, pH 8.0, 150 mM NaCl, 1 mM PMSF, 10 µM aprotinin, 10 µM leupeptin] and the protein complexes were immunoprecipitated with antibodies to the epitope tags as indicated. Immunoprecipitated proteins were resuspended in sample buffer (80 mM Tris-HCl [pH 6.8], 10% [v/v] glycerol, 1% [w/v] SDS, 0.025% Bromphenol Blue), denatured and run on 4% SDS-polyacrylamide gels, which were dried and exposed to film.

For the equilibrium binding analysis, the cells were processed as above, and incubated with 50 pM 125I-SHH and various concentrations of cold SHH-N (Cold Ligand). The IGOR program was used to determine Kd.

2. Results

The results are shown in Figure 6. No binding of epitope tagged SHH-N, of IgG-SHH-N chimeric protein or of an epitope tagged Desert HH to cells expressing rSmo or dSmo was observed (Figures 6a-b and data not shown). These data (and the data described below) indicated that rSmo, acting alone, would not likely be a receptor for SHH or Desert HH. However, it was hypothesized that rSmo is a component in a multi-subunit SHH receptor complex and that the ligand binding function of this receptor complex would be provided by another membrane protein such as Patched.

Binding assays were also conducted to test binding between cells expressing rSmo or murine Patched and (1) an epitope tagged SHH and (2) an IgG-SHH-N chimera. The data show that epitope tagged SHH-N as well as an IgG-SHH-N chimeric protein bind specifically and reversibly to cells expressing the mouse Patched (mPatched) (mPatched is 33% identical to Drosophila Patched) (Figures 6c-e). Furthermore, only mPatched could be immunoprecipitated by the IgG-SHH-N protein (Fig. 6f) and antibodies to an epitope tagged mPatched readily co-immunoprecipitated 125I-SHH-N (Fig. 6h) (antibodies to epitope tagged rSmo could not immunoprecipitate 125I-SHH-N and the IgG-SHH-N chimera did not immunoprecipitate rSmo).

As shown in Fig. 6g, the cross-linking assay of 125I-SHH-N to cells expressing rSmo or mPatched in the presence or absence of cold SHH-N revealed that 125I-SHH-N is cross-linked only to mPatched expressing cells.

The competitive binding assay of 125I-SHH-N and cells expressing mPatched or mPatched plus rSmo also showed that mPatched and SHH-N had a relatively high affinity of interaction (approximate Kd of 460 pM) (Fig. 6i). This corresponds well to the concentrations of SHH-N which are required to elicit biological responses in multiple systems [Fan et al., supra; Hynes et al., supra; Roelink et al., supra]. No binding to cells expressing rSmo alone was observed (data not shown) and there was no increase in binding affinity to mPatched in the presence of rSmo.
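For readers unfamiliar with how a dissociation constant is extracted from a competition experiment of this kind, the following sketch illustrates one textbook approach. It is not the IGOR analysis used in the specification. It assumes a single class of sites, identical affinity for labeled and unlabeled ligand, and negligible ligand depletion, so that tracer binding is B = Bmax * hot / (hot + cold + Kd). Under those assumptions 1/B is linear in the cold ligand concentration, and Kd = intercept/slope - hot. All names (`tracer_bound`, `fit_kd`) and the synthetic data are hypothetical; only the 50 pM tracer concentration and the 460 pM value come from the text.

```python
def tracer_bound(hot, cold, kd, bmax):
    """Specifically bound tracer for one-site homologous competition."""
    return bmax * hot / (hot + cold + kd)


def fit_kd(hot, cold_concs, bound_values):
    """Estimate Kd by least-squares regression of 1/B on cold concentration."""
    xs = list(cold_concs)
    ys = [1.0 / b for b in bound_values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # intercept / slope equals hot + Kd under the stated model
    return intercept / slope - hot


# Synthetic, noise-free data generated from the model (all values in pM).
HOT = 50.0          # tracer concentration stated in the text
TRUE_KD = 460.0     # approximate Kd reported in the text
cold = [0.0, 100.0, 300.0, 1000.0, 3000.0, 10000.0]
bound = [tracer_bound(HOT, c, TRUE_KD, bmax=1000.0) for c in cold]

print(round(fit_kd(HOT, cold, bound), 3))
```

With real, noisy counts a weighted nonlinear fit is usually preferred over this reciprocal linearization, which amplifies error at low binding.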

EXAMPLE 5

Co-immunoprecipitation Assays

To determine whether Patched and Smo form or interact in a physical complex, co-immunoprecipitation experiments were performed.

1. Materials and Methods

For the double immunohistochemistry, COS-7 cells transiently transfected with expression constructs were permeabilized using 0.2% Triton-X 100. The cells were fixed (10 minutes in 2% paraformaldehyde/0.2% Triton-X 100) and stained using monoclonal anti-Flag M2 antibody (IBI) and rabbit polyclonal anti-Myc primary antibodies (Santa Cruz Biotech), followed by cy3-conjugated anti-mouse IgG (Jackson ImmunoResearch) and bodipy-conjugated anti-rabbit IgG secondary antibodies (Molecular Probes, Inc.).

Human embryonic kidney 293 cells were transiently transfected with expression vectors for epitope tagged rSmo (Flag epitope) and mPatched (Myc epitope) and the resulting protein complexes were immunoprecipitated with antibody to one of the epitopes and then analyzed on a western blot.

For the co-immunoprecipitation assay, lysates from 293 embryonic kidney cells transiently expressing Flag-tagged rSmo, Myc-tagged mPatched or a combination of the two proteins were incubated (48 hours after transfection) in the presence or absence of the IgG-SHH-N chimera (1 ng/ml, 30 minutes at 37°C) or in the presence of 125I-SHH-N with or without an excess of cold SHH-N (2 hours at 4°C). The incubated samples were then washed 3 times with PBS, and lysed in lysis buffer (see Example 4) as described by Davis et al., Science, 259:1736-1739 (1993). The cell lysates were centrifuged at 10,000 rpm for 10 minutes, and the soluble protein complexes were immunoprecipitated with either protein A sepharose (for the IgG-SHH-N), or anti-Flag or anti-Myc antibodies followed by protein A sepharose (for the epitope-tagged rSmo or mPatched, respectively).

The samples were heated to 100°C for 5 minutes in denaturing SDS sample buffer (125 mM Tris, pH 6.8, 2% SDS, 10% glycerol, 100 mM β-mercaptoethanol, 0.05% bromphenol blue) and subjected to SDS-PAGE. The proteins were detected either by exposure of the dried gel to film (for 125I-SHH-N) or by blotting to nitrocellulose and probing with antibodies to Flag or Myc epitopes using the ECL detection system (Amersham).

2. Results

The results are illustrated in Figure 7. In cells expressing mPatched alone, or rSmo alone, no co-immunoprecipitated protein complexes could be detected. In contrast, in cells that expressed both mPatched and rSmo (Fig. 7a), rSmo was readily co-immunoprecipitated by antibodies to the epitope tagged mPatched (Fig. 7b) and mPatched was co-immunoprecipitated by antibodies to the epitope tagged rSmo (Fig. 7c).

The 125I-SHH-N was readily co-immunoprecipitated by antibodies to the epitope tagged rSmo or mPatched from cells that expressed both rSmo and mPatched, but not from cells expressing rSmo alone (Figs. 7d and 7e). These results indicate that SHH-N, rSmo and mPatched are present in the same physical complex, and that a rSmo-SHH complex does not form in the absence of mPatched. Although not fully understood and not being bound by any particular theory, it is believed that Patched is a ligand binding component and vSmo is a signalling component in a multi-subunit SHH receptor (See, Fig. 9). Patched is also believed to be a negative regulator of vSmo.

EXAMPLE 6 

Hahn et al., supra, Johnson et al., supra, and Gailani et al., supra, report that Patched mutations have been associated with BCNS and sporadic basal cell carcinoma ("BCC"). These investigators also report that most of the Patched mutations in BCNS are truncations in which no functional protein is produced. It is believed that BCNS and BCC may be caused by or associated with constitutive activation of vSmo, following its release from negative regulation by Patched.

Expression levels of wild-type (native) murine Patched and a mutant Patched were examined. A Patched mutant was generated by site-directed mutagenesis of the wild-type mouse Patched cDNA (described in Example 4) and verified by sequencing. The mutant Patched contained a 3 amino acid insertion (Pro-Asn-Ile) after amino acid residue 815 (this mutant was found in a BCNS family, see Hahn et al., supra). For analysis of protein expression, equal amounts of pRK5 expression vectors containing wild-type or mutant Patched were transfected into 293 cells, and an equal number of cells (2 x 10^) were lysed per sample. Proteins were immunoprecipitated from cell lysates by antibody to the Patched epitope tag (myc) and detected on a Western blot with the same antibody.

Applicants found that expression of the mutant Patched (which retains a complete open reading frame) was reduced at least 10-fold as compared to its wild-type counterpart. See Fig. 8.

*****

Deposit of Material

The following materials have been deposited with the American Type Culture Collection, 12301 Parklawn Drive, Rockville, MD, USA (ATCC):

Material            ATCC Dep. No.    Deposit Date
puc.118.hsmo.5      98162            Sept. 6, 1996
puc.118.hsmo.14     98163            Sept. 6, 1996
pRK5.rsmo.AR140     98165            Sept. 10, 1996

This deposit was made under the provisions of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the Purpose of Patent Procedure and the Regulations thereunder (Budapest Treaty). This assures maintenance of a viable culture of the deposit for 30 years from the date of deposit. The deposit will be made available by ATCC under the terms of the Budapest Treaty, and subject to an agreement between Genentech, Inc. and ATCC, which assures permanent and unrestricted availability of the progeny of the culture of the deposit to the public upon issuance of the pertinent U.S. patent or upon laying open to the public of any U.S. or foreign patent application, whichever comes first, and assures availability of the progeny to one determined by the U.S. Commissioner of Patents and Trademarks to be entitled thereto according to 35 USC §122 and the Commissioner's rules pursuant thereto (including 37 CFR §1.14 with particular reference to 886 OG 638).

The assignee of the present application has agreed that if a culture of the materials on deposit should die or be lost or destroyed when cultivated under suitable conditions, the materials will be promptly replaced on notification with another of the same. Availability of the deposited material is not to be construed as a license to practice the invention in contravention of the rights granted under the authority of any government in accordance with its patent laws.

The foregoing written specification is considered to be sufficient to enable one skilled in the art to practice the invention. The present invention is not to be limited in scope by the construct deposited, since the deposited embodiment is intended as a single illustration of certain aspects of the invention and any constructs that are functionally equivalent are within the scope of this invention. The deposit of material herein does not constitute an admission that the written description herein contained is inadequate to enable the practice of any aspect of the invention, including the best mode thereof, nor is it to be construed as limiting the scope of the claims to the specific illustrations that it represents. Indeed, various modifications of the invention in addition to those shown and described herein will become apparent to those skilled in the art from the foregoing description and fall within the scope of the appended claims.

SEQUENCE LISTING
(1) GENERAL INFORMATION:

(i) APPLICANT: Genentech, Inc.
(ii) TITLE OF INVENTION: Vertebrate Smoothened Proteins
(iii) NUMBER OF SEQUENCES: 4
(iv) CORRESPONDENCE ADDRESS:
    (A) ADDRESSEE: Genentech, Inc.
    (B) STREET: 460 Point San Bruno Blvd
    (C) CITY: South San Francisco
    (D) STATE: California
    (E) COUNTRY: USA
    (F) ZIP: 94080
(v) COMPUTER READABLE FORM:
    (A) MEDIUM TYPE: 3.5 inch, 1.44 Mb floppy disk
    (B) COMPUTER: IBM PC compatible
    (C) OPERATING SYSTEM: PC-DOS/MS-DOS
    (D) SOFTWARE: WinPatin (Genentech)
(vi) CURRENT APPLICATION DATA:
    (A) APPLICATION NUMBER:
    (B) FILING DATE:
    (C) CLASSIFICATION:
(viii) ATTORNEY/AGENT INFORMATION:
    (A) NAME: Svoboda, Craig G.
    (B) REGISTRATION NUMBER: 39,044
    (C) REFERENCE/DOCKET NUMBER: P1050PCT
(ix) TELECOMMUNICATION INFORMATION:
    (A) TELEPHONE: 415/225-1489
    (B) TELEFAX: 415/952-9881
    (C) TELEX: 910/371-7168

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 3854 base pairs
    (B) TYPE: Nucleic Acid
    (C) STRANDEDNESS: Single
    (D) TOPOLOGY: Linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

GCGGCGCGCT CGCGCGGAGG TGGCTGCTGG GCCGCGGGCT GGCGTGGGGG 50 

CGGAGCCGGG GAGCGACTCC CGCACCCCAC GGCCGGTGCC TGCCCTCCAT 100 

CGAGGGGCTG GGAGTTAGTT TTAATGGTGG GAGAGGGAAT GGGGCTGAAG 150 

ATCGGGGCCC CAGAGGGTTC CCAGGGTTGA AGACAATTCC AATCGAGGCG 200

AGGGAGTCCG GGGTCCGTGC ATCCTGGCCC GGGCCTGCGC AGCTCAACAT 250

GGGGCCCGGG TTCCAAAGTT TGCAAAGTTG GGAGCCGAGG GGCCCGGACG 300

CGCGCGGCGC CTGGCGAAAG CTGGCCCCAG ACTTTCGGGG CGCACCGGTC 350

GCCTAAGTAG CCTCCGCGGC CCCCGGGGTC GTGTGTGTGG CCAGGGGACT 400

CCGGGGAGCT CGGGGGCGCC TCAGCTTCTG CTGAGTTGGC GGTTTGGCC 449

ATG GCT GCT GGC CGC CCC GTG CGT GGG CCC GAG CTG GCG 488
Met Ala Ala Gly Arg Pro Val Arg Gly Pro Glu Leu Ala
1 5 10

CCC CGG AGG CTG CTG CAG TTG CTG CTG CTG GTA CTG CTT 527
Pro Arg Arg Leu Leu Gln Leu Leu Leu Leu Val Leu Leu
15 20 25

GGG GGC CGG GGC CGG GGG GCG GCC TTG AGC GGG AAC GTG 566
Gly Gly Arg Gly Arg Gly Ala Ala Leu Ser Gly Asn Val
30 35

ACC GGG CCT GGG CCT CGC AGT GCC GGC GGG AGC GCG AGG 605
Thr Gly Pro Gly Pro Arg Ser Ala Gly Gly Ser Ala Arg
40 45 50

AGG AAC GCG CCG GTG ACC AGC CCT CCG CCG CCG CTG CTG 644
Arg Asn Ala Pro Val Thr Ser Pro Pro Pro Pro Leu Leu
55 60 65

AGC CAC TGC GGC CGG GCC GCC CAC TGC GAG CCT TTG CGC 683
Ser His Cys Gly Arg Ala Ala His Cys Glu Pro Leu Arg
70 75

TAC AAC GTG TGC CTG GGC TCC GCG CTG CCC TAC GGA GCC 722
Tyr Asn Val Cys Leu Gly Ser Ala Leu Pro Tyr Gly Ala
80 85 90

ACC ACC ACG CTG CTG GCT GGG GAC TCG GAC TCG CAG GAG 761
Thr Thr Thr Leu Leu Ala Gly Asp Ser Asp Ser Gln Glu
95 100

GAA GCG CAC AGC AAG CTC GTG CTC TGG TCC GGC CTC CGG 800
Glu Ala His Ser Lys Leu Val Leu Trp Ser Gly Leu Arg
105 110 115

AAT GCT CCC CGA TGC TGG GCA GTG ATC CAG CCC CTG CTG 839
Asn Ala Pro Arg Cys Trp Ala Val Ile Gln Pro Leu Leu
120 125 130

TGT GCT GTC TAC ATG CCC AAG TGT GAA AAT GAC CGA GTG 878
Cys Ala Val Tyr Met Pro Lys Cys Glu Asn Asp Arg Val
135 140

GAG TTG CCC AGC CGT ACC CTC TGC CAG GCC ACC CGA GGC 917
Glu Leu Pro Ser Arg Thr Leu Cys Gln Ala Thr Arg Gly
145 150 155

CCC TGT GCC ATT GTG GAG CGG GAA CGA GGG TGG CCT GAC 956
Pro Cys Ala Ile Val Glu Arg Glu Arg Gly Trp Pro Asp
160 165

TTT CTG CGT TGC ACG CCG GAC CAC TTC CCT GAA GGC TGT 995
Phe Leu Arg Cys Thr Pro Asp His Phe Pro Glu Gly Cys
170 175 180

CCA AAC GAG GTA CAA AAC ATC AAG TTC AAC AGT TCA GGC 1034
Pro Asn Glu Val Gln Asn Ile Lys Phe Asn Ser Ser Gly
185 190 195

CAA TGT GAA GCA CCC TTG GTG AGG ACA GAC AAC CCC AAG 1073
Gln Cys Glu Ala Pro Leu Val Arg Thr Asp Asn Pro Lys
200 205

AGC TGG TAC GAG GAC GTG GAG GGC TGT GGG ATC CAG TGC 1112
Ser Trp Tyr Glu Asp Val Glu Gly Cys Gly Ile Gln Cys
210 215 220

CAG AAC CCG CTG TTC ACC GAG GCT GAG CAC CAG GAC ATG 1151
Gln Asn Pro Leu Phe Thr Glu Ala Glu His Gln Asp Met
225 230

CAC AGT TAC ATC GCA GCC TTC GGG GCG GTC ACC GGC CTC 1190
His Ser Tyr Ile Ala Ala Phe Gly Ala Val Thr Gly Leu
235 240 245

TGT ACA CTC TTC ACC CTG GCC ACC TTT GTG GCT GAC TGG 1229
Cys Thr Leu Phe Thr Leu Ala Thr Phe Val Ala Asp Trp
250 255 260

CGG AAC TCC AAT CGC TAC CCT GCG GTT ATT CTC TTC TAT 1268
Arg Asn Ser Asn Arg Tyr Pro Ala Val Ile Leu Phe Tyr
265 270

GTC AAT GCG TGT TTC TTT GTG GGC AGC ATT GGC TGG CTG 1307
Val Asn Ala Cys Phe Phe Val Gly Ser Ile Gly Trp Leu
275 280 285

GCC CAG TTC ATG GAT GGT GCC CGC CGG GAG ATT GTT TGC 1346
Ala Gln Phe Met Asp Gly Ala Arg Arg Glu Ile Val Cys
290 295

CGA GCA GAT GGC ACC ATG AGA TTT GGG GAG CCC ACC TCC 1385
Arg Ala Asp Gly Thr Met Arg Phe Gly Glu Pro Thr Ser
300 305 310

AGC GAG ACC CTA TCC TGT GTC ATC ATC TTT GTC ATC GTG 1424
Ser Glu Thr Leu Ser Cys Val Ile Ile Phe Val Ile Val
315 320 325

TAC TAT GCC TTG ATG GCT GGA GTA GTG TGG TTC GTG GTC 1463
Tyr Tyr Ala Leu Met Ala Gly Val Val Trp Phe Val Val
330 335

CTC ACC TAT GCC TGG CAC ACC TCC TTC AAA GCC CTG GGC 1502
Leu Thr Tyr Ala Trp His Thr Ser Phe Lys Ala Leu Gly
340 345 350

ACC ACT TAC CAG CCT CTC TCG GGC AAG ACA TCC TAT TTC 1541
Thr Thr Tyr Gln Pro Leu Ser Gly Lys Thr Ser Tyr Phe
355 360

CAC CTG CTC ACG TGG TCA CTC CCC TTC GTC CTC ACT GTG 1580
His Leu Leu Thr Trp Ser Leu Pro Phe Val Leu Thr Val
365 370 375

GCA ATC CTT GCT GTG GCT CAG GTA GAT GGG GAC TCC GTG 1619
Ala Ile Leu Ala Val Ala Gln Val Asp Gly Asp Ser Val
380 385 390

AGT GGC ATC TGC TTT GTA GGC TAC AAG AAC TAT CGG TAC 1658
Ser Gly Ile Cys Phe Val Gly Tyr Lys Asn Tyr Arg Tyr
395 400

CGT GCT GGC TTT GTA CTT GCC CCA ATT GGC CTG GTG CTT 1697
Arg Ala Gly Phe Val Leu Ala Pro Ile Gly Leu Val Leu
405 410 415

ATT GTG GGA GGC TAC TTC CTC ATC CGA GGG GTC ATG ACT 1736
Ile Val Gly Gly Tyr Phe Leu Ile Arg Gly Val Met Thr
420 425

CTG TTC TCC ATC AAG AGC AAC CAC CCT GGG CTT CTG AGT 1775
Leu Phe Ser Ile Lys Ser Asn His Pro Gly Leu Leu Ser
430 435 440

GAG AAG GCA GCC AGC AAG ATC AAT GAG ACC ATG CTG CGC 1814
Glu Lys Ala Ala Ser Lys Ile Asn Glu Thr Met Leu Arg
445 450 455

CTG GGC ATT TTT GGC TTC CTC GCC TTT GGC TTC GTG CTC 1853
Leu Gly Ile Phe Gly Phe Leu Ala Phe Gly Phe Val Leu
460 465

ATC ACC TTC AGC TGC CAC TTC TAT GAC TTC TTC AAC CAG 1892
Ile Thr Phe Ser Cys His Phe Tyr Asp Phe Phe Asn Gln
470 475 480

GCT GAG TGG GAG CGT AGC TTC CGG GAC TAT GTG CTA TGC 1931
Ala Glu Trp Glu Arg Ser Phe Arg Asp Tyr Val Leu Cys
485 490

CAA GCC AAT GTG ACC ATT GGG CTG CCT ACC AAG AAG CCC 1970
Gln Ala Asn Val Thr Ile Gly Leu Pro Thr Lys Lys Pro
495 500 505

ATT CCT GAT TGT GAG ATC AAG AAT CGG CCC AGC CTC CTG 2009
Ile Pro Asp Cys Glu Ile Lys Asn Arg Pro Ser Leu Leu
510 515 520

GTG GAG AAG ATC AAT CTG TTT GCC ATG TTT GGC ACT GGC 2048
Val Glu Lys Ile Asn Leu Phe Ala Met Phe Gly Thr Gly
525 530

ATT GCC ATG AGC ACC TGG GTC TGG ACC AAG GCC ACC CTG 2087
Ile Ala Met Ser Thr Trp Val Trp Thr Lys Ala Thr Leu
535 540 545

CTC ATC TGG AGG CGC ACC TGG TGC AGG TTG ACT GGG CAC 2126
Leu Ile Trp Arg Arg Thr Trp Cys Arg Leu Thr Gly His
550 555

AGT GAT GAT GAA CCC AAG AGA ATC AAG AAA AGC AAG ATG 2165
Ser Asp Asp Glu Pro Lys Arg Ile Lys Lys Ser Lys Met
560 565 570

ATT GCC AAG GCC TTC TCT AAG CGG CGT GAA CTG CTG CAG 2204
Ile Ala Lys Ala Phe Ser Lys Arg Arg Glu Leu Leu Gln
575 580 585

AAC CCG GGC CAG GAG CTC TCC TTC AGC ATG CAC ACT GTC 2243
Asn Pro Gly Gln Glu Leu Ser Phe Ser Met His Thr Val
590 595

TCC CAT GAT GGA CCT GTT GCC GGT TTG GCT TTT GAA CTC 2282
Ser His Asp Gly Pro Val Ala Gly Leu Ala Phe Glu Leu
600 605 610

AAT GAA CCC TCA GCT GAT GTC TCC TCT GCC TGG GCC CAG 2321
Asn Glu Pro Ser Ala Asp Val Ser Ser Ala Trp Ala Gln
615 620

CAC GTC ACC AAG ATG GTG GCT CGA AGA GGA GCC ATA TTA 2360
His Val Thr Lys Met Val Ala Arg Arg Gly Ala Ile Leu
625 630 635

CCC CAG GAT GTG TCT GTC ACC CCT GTG GCA ACT CCA GTG 2399
Pro Gln Asp Val Ser Val Thr Pro Val Ala Thr Pro Val
640 645 650

CCA CCA GAA GAA CAA GCC AAC CTG TGG CTG GTT GAG GCA 2438
Pro Pro Glu Glu Gln Ala Asn Leu Trp Leu Val Glu Ala
655 660

GAG ATC TCC CCA GAG TTA GAG AAG CGT TTA GGC CGG AAG 2477
Glu Ile Ser Pro Glu Leu Glu Lys Arg Leu Gly Arg Lys
665 670 675

AAG AAG CGG AGG AAG AGG AAG AAG GAG GTG TGC CCC TTG 2516
Lys Lys Arg Arg Lys Arg Lys Lys Glu Val Cys Pro Leu
680 685

GGG CCA GCC CCT GAA CTT CAC CAC TCT GCC CCT GTT CCT 2555
Gly Pro Ala Pro Glu Leu His His Ser Ala Pro Val Pro
690 695 700

GCC ACC AGT GCA GTT CCT CGG CTG CCT CAG CTG CCT CGG 2594
Ala Thr Ser Ala Val Pro Arg Leu Pro Gln Leu Pro Arg
705 710 715

CAG AAG TGC CTA GTA GCT GCA AAT GCC TGG GGA ACA GGA 2633
Gln Lys Cys Leu Val Ala Ala Asn Ala Trp Gly Thr Gly
720 725

GAG CCC TGC CGA CAG GGA GCC TGG ACT GTA GTC TCC AAC 2672
Glu Pro Cys Arg Gln Gly Ala Trp Thr Val Val Ser Asn
730 735 740

CCC TTC TGC CCA GAG CCT AGT CCC CAT CAA GAT CCA TTT 2711
Pro Phe Cys Pro Glu Pro Ser Pro His Gln Asp Pro Phe
745 750

CTC CCT GGT GCC TCA GCC CCC AGG GTC TGG GCT CAG GGC 2750
Leu Pro Gly Ala Ser Ala Pro Arg Val Trp Ala Gln Gly
755 760 765

CGC CTC CAG GGG CTG GGA TCC ATT CAT TCC CGC ACT AAC 2789
Arg Leu Gln Gly Leu Gly Ser Ile His Ser Arg Thr Asn
770 775 780

CTA ATG GAG GCT GAG CTC TTG GAT GCA GAC TCG GAC TTC TG 2830
Leu Met Glu Ala Glu Leu Leu Asp Ala Asp Ser Asp Phe
785 790 793

AGCTTGCAGG GCAGGTCCTA GGATGGGGAA GACAAGTGCA CGCCTTCCTA 2880

TAGCTCTTCC TGAGAGCACA CCTCTGGGGT CTCATCTGAC AGTCTATGGG 2930

CCATGTATCT GCCTACAAGA GCTGTGTAC3 ACTGGCTAGA AGCAGCCAGA 2980

CCATAGAAAC AAGCTGAACA CAGCCACTGA TAGACCTCAC TTCAGAAGCA 3030

AGACCTGCAG TTCAGGACCC TTGCCTCTGC CCCCCAATTA GAGTCTGGCT 3080

GGCAGTGTTA GTCTCCAACA GAGCTTGTAC TAGGGTAGGA ACGGCAGAGG 3130

CAGGGGTGAT GGTACCCAGA GTGGGCTGGG GTGTCCAGTG AGGTAACCAA 3180

GCCCATGTCT GGCAGATGAG GGCTGGCTGC CCTTTTCTGT GCCAATGAGT 3230

GCCCTTTTCT GGCGCTCTGA GACCAAAAGT GTTTATTGTG TCATTTGTCC 3280

TTTTTCTAGG TGGGAACAGG ACTCTCTTTT TCCTCTTCCT GGTAGTTGTA 3330

ATGACTACTC CCATAAGGCC TAGAACTGCT CTCAGTAGGT GGCCCTGTCC 3380

AAAACACATC TTCACATCTT AGTTCCACTA GGCCAAACTC TTATTGGTTA 3430

GCACCTTAAA ACACACACAC ACACACACAC ACACACACAC ACACACACAC 3480

ACACACACAC ACCCTCTTAC TTCTGAGCTT GGTCTCAAGA GAGAGACAAC 3530

TGGTTCAGCT CCAGGCCTCT GAGAGTCATG TTTTCTTCCT CACATCCATC 3580

CAGTGGGGAT GGATCCTCTG ACTTAAGGGG CTACCTTGGG AAGCCTCTGT 3630

AGCTTCAGCC AGGCAAGAAA GCTTCTTCCA ACTTCTGTAT CTGGTGGGAA 3680

GGAGGACTCC CTACTTTTTA CAATGTCTAG TCATTTTCAT ACTGCCCCAC 3730

ATTCAAGAAC CAGACAGCAG GATGCCTTAG AAGCTGGCTG GGTTCCAGGT 3780

CAGAGGCTCA GTATGAGAAG AAGAAATATG AACAGTAAAT AAAACATTTT 3830

TGTATAAAAA AAAAAAAAAA AAAA 3854

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 793 amino acids
(B) TYPE: Amino Acid
(D) TOPOLOGY: Linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Ala Ala Gly Arg Pro Val Arg Gly Pro Glu Leu Ala Pro Arg
1 5 10 15

Arg Leu Leu Gln Leu Leu Leu Leu Val Leu Leu Gly Gly Arg Gly
20 25 30

Arg Gly Ala Ala Leu Ser Gly Asn Val Thr Gly Pro Gly Pro Arg
35 40 45

Ser Ala Gly Gly Ser Ala Arg Arg Asn Ala Pro Val Thr Ser Pro
50 55 60

Pro Pro Pro Leu Leu Ser His Cys Gly Arg Ala Ala His Cys Glu
65 70 75

Pro Leu Arg Tyr Asn Val Cys Leu Gly Ser Ala Leu Pro Tyr Gly
80 85 90

Ala Thr Thr Thr Leu Leu Ala Gly Asp Ser Asp Ser Gln Glu Glu
95 100 105

Ala His Ser Lys Leu Val Leu Trp Ser Gly Leu Arg Asn Ala Pro
110 115 120

Arg Cys Trp Ala Val Ile Gln Pro Leu Leu Cys Ala Val Tyr Met
125 130 135

Pro Lys Cys Glu Asn Asp Arg Val Glu Leu Pro Ser Arg Thr Leu
140 145 150

Cys Gln Ala Thr Arg Gly Pro Cys Ala Ile Val Glu Arg Glu Arg
155 160 165

Gly Trp Pro Asp Phe Leu Arg Cys Thr Pro Asp His Phe Pro Glu
170 175 180

Gly Cys Pro Asn Glu Val Gln Asn Ile Lys Phe Asn Ser Ser Gly
185 190 195

Gln Cys Glu Ala Pro Leu Val Arg Thr Asp Asn Pro Lys Ser Trp
200 205 210

Tyr Glu Asp Val Glu Gly Cys Gly Ile Gln Cys Gln Asn Pro Leu
215 220 225

Phe Thr Glu Ala Glu His Gln Asp Met His Ser Tyr Ile Ala Ala
230 235 240

Phe Gly Ala Val Thr Gly Leu Cys Thr Leu Phe Thr Leu Ala Thr
245 250 255

Phe Val Ala Asp Trp Arg Asn Ser Asn Arg Tyr Pro Ala Val Ile
260 265 270

Leu Phe Tyr Val Asn Ala Cys Phe Phe Val Gly Ser Ile Gly Trp
275 280 285

Leu Ala Gln Phe Met Asp Gly Ala Arg Arg Glu Ile Val Cys Arg
290 295 300

Ala Asp Gly Thr Met Arg Phe Gly Glu Pro Thr Ser Ser Glu Thr
305 310 315

Leu Ser Cys Val Ile Ile Phe Val Ile Val Tyr Tyr Ala Leu Met
320 325 330

Ala Gly Val Val Trp Phe Val Val Leu Thr Tyr Ala Trp His Thr
335 340 345

Ser Phe Lys Ala Leu Gly Thr Thr Tyr Gln Pro Leu Ser Gly Lys
350 355 360

Thr Ser Tyr Phe His Leu Leu Thr Trp Ser Leu Pro Phe Val Leu
365 370 375

Thr Val Ala Ile Leu Ala Val Ala Gln Val Asp Gly Asp Ser Val
380 385 390

Ser Gly Ile Cys Phe Val Gly Tyr Lys Asn Tyr Arg Tyr Arg Ala
395 400 405

Gly Phe Val Leu Ala Pro Ile Gly Leu Val Leu Ile Val Gly Gly
410 415 420

Tyr Phe Leu Ile Arg Gly Val Met Thr Leu Phe Ser Ile Lys Ser
425 430 435

Asn His Pro Gly Leu Leu Ser Glu Lys Ala Ala Ser Lys Ile Asn
440 445 450

Glu Thr Met Leu Arg Leu Gly Ile Phe Gly Phe Leu Ala Phe Gly
455 460 465

Phe Val Leu Ile Thr Phe Ser Cys His Phe Tyr Asp Phe Phe Asn
470 475 480

Gln Ala Glu Trp Glu Arg Ser Phe Arg Asp Tyr Val Leu Cys Gln
485 490 495

Ala Asn Val Thr Ile Gly Leu Pro Thr Lys Lys Pro Ile Pro Asp
500 505 510

Cys Glu Ile Lys Asn Arg Pro Ser Leu Leu Val Glu Lys Ile Asn
515 520 525

Leu Phe Ala Met Phe Gly Thr Gly Ile Ala Met Ser Thr Trp Val
530 535 540

Trp Thr Lys Ala Thr Leu Leu Ile Trp Arg Arg Thr Trp Cys Arg
545 550 555

Leu Thr Gly His Ser Asp Asp Glu Pro Lys Arg Ile Lys Lys Ser
560 565 570

Lys Met Ile Ala Lys Ala Phe Ser Lys Arg Arg Glu Leu Leu Gln
575 580 585

Asn Pro Gly Gln Glu Leu Ser Phe Ser Met His Thr Val Ser His
590 595 600

Asp Gly Pro Val Ala Gly Leu Ala Phe Glu Leu Asn Glu Pro Ser
605 610 615

Ala Asp Val Ser Ser Ala Trp Ala Gln His Val Thr Lys Met Val
620 625 630

Ala Arg Arg Gly Ala Ile Leu Pro Gln Asp Val Ser Val Thr Pro
635 640 645

Val Ala Thr Pro Val Pro Pro Glu Glu Gln Ala Asn Leu Trp Leu
650 655 660

Val Glu Ala Glu Ile Ser Pro Glu Leu Glu Lys Arg Leu Gly Arg
665 670 675

Lys Lys Lys Arg Arg Lys Arg Lys Lys Glu Val Cys Pro Leu Gly
680 685 690

Pro Ala Pro Glu Leu His His Ser Ala Pro Val Pro Ala Thr Ser
695 700 705

Ala Val Pro Arg Leu Pro Gln Leu Pro Arg Gln Lys Cys Leu Val
710 715 720

Ala Ala Asn Ala Trp Gly Thr Gly Glu Pro Cys Arg Gln Gly Ala
725 730 735

Trp Thr Val Val Ser Asn Pro Phe Cys Pro Glu Pro Ser Pro His
740 745 750

Gln Asp Pro Phe Leu Pro Gly Ala Ser Ala Pro Arg Val Trp Ala
755 760 765

Gln Gly Arg Leu Gln Gly Leu Gly Ser Ile His Ser Arg Thr Asn
770 775 780

Leu Met Glu Ala Glu Leu Leu Asp Ala Asp Ser Asp Phe
785 790 793

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2972 base pairs
(B) TYPE: Nucleic Acid
(C) STRANDEDNESS: Single
(D) TOPOLOGY: Linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

CGGGGGTTGG CC ATG GCC GCT GCC CGC CCA GCG CGG GGG 39
Met Ala Ala Ala Arg Pro Ala Arg Gly
1 5

CCG GAG CTC CCG CTC CTG GGG CTG CTG CTG CTG CTG CTG 78
Pro Glu Leu Pro Leu Leu Gly Leu Leu Leu Leu Leu Leu
10 15 20

CTG GGG GAC CCG GGC CGG GGG GCG GCC TCG AGC GGG AAC 117
Leu Gly Asp Pro Gly Arg Gly Ala Ala Ser Ser Gly Asn
25 30 35

GCG ACC GGG CCT GGG CCT CGG AGC GCG GGC GGG AGC GCG 156
Ala Thr Gly Pro Gly Pro Arg Ser Ala Gly Gly Ser Ala
40 45

AGG AGG AGC GCG GCG GTG ACT GGC CCT CCG CCG CCG CTG 195
Arg Arg Ser Ala Ala Val Thr Gly Pro Pro Pro Pro Leu
50 55 60

AGC CAC TGC GGC CGG GCT GCC CCC TGC GAG CCG CTG CGC 234
Ser His Cys Gly Arg Ala Ala Pro Cys Glu Pro Leu Arg
65 70

TAC AAC GTG TGC CTG GGC TCG GTG CTG CCC TAC GGG GCC 273
Tyr Asn Val Cys Leu Gly Ser Val Leu Pro Tyr Gly Ala
75 80 85

ACC TCC ACA CTG CTG GCC GGA GAC TCG GAC TCC CAG GAG 312
Thr Ser Thr Leu Leu Ala Gly Asp Ser Asp Ser Gln Glu
90 95 100

GAA GCG CAC GGC AAG CTC GTG CTC TGG TCG GGC CTC CGG 351
Glu Ala His Gly Lys Leu Val Leu Trp Ser Gly Leu Arg
105 110

AAT GCC CCC CGC TGC TGG GCA GTG ATC CAG CCC CTG CTG 390
Asn Ala Pro Arg Cys Trp Ala Val Ile Gln Pro Leu Leu
115 120 125

TGT GCC GTA TAC ATG CCC AAG TGT GAG AAT GAC CGG GTG 429
Cys Ala Val Tyr Met Pro Lys Cys Glu Asn Asp Arg Val
130 135

GAG CTG CCC AGC CGT ACC CTC TGC CAG GCC ACC CGA GGC 468
Glu Leu Pro Ser Arg Thr Leu Cys Gln Ala Thr Arg Gly
140 145 150
CCC TGT GCC ATC GTG GAG AGG GAG CGG GGC TGG CCT GAC 507
Pro Cys Ala Ile Val Glu Arg Glu Arg Gly Trp Pro Asp
155 160 165

TTC CTG CGC TGC ACT CCT GAC CGC TTC CCT GAA GGC TGC 546
Phe Leu Arg Cys Thr Pro Asp Arg Phe Pro Glu Gly Cys
170 175

ACG AAT GAG GTG CAG AAC ATC AAG TTC AAC AGT TCA GGC 585
Thr Asn Glu Val Gln Asn Ile Lys Phe Asn Ser Ser Gly
180 185 190

CAG TGC GAA GTG CCC TTG GTT CGG ACA GAC AAC CCC AAG 624
Gln Cys Glu Val Pro Leu Val Arg Thr Asp Asn Pro Lys
195 200

AGC TGG TAC GAG GAC GTG GAG GGC TGC GGC ATC CAG TGC 663
Ser Trp Tyr Glu Asp Val Glu Gly Cys Gly Ile Gln Cys
205 210 215

CAG AAC CCG CTC TTC ACA GAG GCT GAG CAC CAG GAC ATG 702
Gln Asn Pro Leu Phe Thr Glu Ala Glu His Gln Asp Met
220 225 230

CAC AGC TAC ATC GCG GCC TTC GGG GCC GTC ACG GGC CTC 741
His Ser Tyr Ile Ala Ala Phe Gly Ala Val Thr Gly Leu
235 240

TGC ACG CTC TTC ACC CTG GCC ACA TTC GTG GCT GAC TGG 780
Cys Thr Leu Phe Thr Leu Ala Thr Phe Val Ala Asp Trp
245 250 255

CGG AAC TCG AAT CGC TAC CCT GCT GTT ATT CTC TTC TAC 819
Arg Asn Ser Asn Arg Tyr Pro Ala Val Ile Leu Phe Tyr
260 265

GTC AAT GCG TGC TTC TTT GTG GGC AGC ATT GGC TGG CTG 858
Val Asn Ala Cys Phe Phe Val Gly Ser Ile Gly Trp Leu
270 275 280

GCC CAG TTC ATG GAT GGT GCC CGC CGA GAG ATC GTC TGC 897
Ala Gln Phe Met Asp Gly Ala Arg Arg Glu Ile Val Cys
285 290 295

CGT GCA GAT GGC ACC ATG AGG CTT GGG GAG CCC ACC TCC 936
Arg Ala Asp Gly Thr Met Arg Leu Gly Glu Pro Thr Ser
300 305

AAT GAG ACT CTG TCC TGC GTC ATC ATC TTT GTC ATC GTG 975
Asn Glu Thr Leu Ser Cys Val Ile Ile Phe Val Ile Val
310 315 320

TAC TAC GCC CTG ATG GCT GGT GTG GTT TGG TTT GTG GTC 1014
Tyr Tyr Ala Leu Met Ala Gly Val Val Trp Phe Val Val
325 330

CTC ACC TAT GCC TGG CAC ACT TCC TTC AAA GCC CTG GGC 1053
Leu Thr Tyr Ala Trp His Thr Ser Phe Lys Ala Leu Gly
335 340 345

ACC ACC TAC CAG CCT CTC TCG GGC AAG ACC TCC TAC TTC 1092
Thr Thr Tyr Gln Pro Leu Ser Gly Lys Thr Ser Tyr Phe
350 355 360

CAC CTG CTC ACC TGG TCA CTC CCC TTT GTC CTC ACT GTG 1131
His Leu Leu Thr Trp Ser Leu Pro Phe Val Leu Thr Val
365 370

GCA ATC CTT GCT GTG GCG CAG GTG GAT GGG GAC TCT GTG 1170
Ala Ile Leu Ala Val Ala Gln Val Asp Gly Asp Ser Val
375 380 385

AGT GGC ATT TGT TTT GTG GGC TAC AAG AAC TAC CGA TAC 1209
Ser Gly Ile Cys Phe Val Gly Tyr Lys Asn Tyr Arg Tyr
390 395

CGT GCG GGC TTC GTG CTG GCC CCA ATC GGC CTG GTG CTC 1248
Arg Ala Gly Phe Val Leu Ala Pro Ile Gly Leu Val Leu
400 405 410

ATC GTG GGA GGC TAC TTC CTC ATC CGA GGA GTC ATG ACT 1287
Ile Val Gly Gly Tyr Phe Leu Ile Arg Gly Val Met Thr
415 420 425

CTG TTC TCC ATC AAG AGC AAC CAC CCC GGG CTG CTG AGT 1326
Leu Phe Ser Ile Lys Ser Asn His Pro Gly Leu Leu Ser
430 435

GAG AAG GCT GCC AGC AAG ATC AAC GAG ACC ATG CTG CGC 1365
Glu Lys Ala Ala Ser Lys Ile Asn Glu Thr Met Leu Arg
440 445 450

CTG GGC ATT TTT GGC TTC CTG GCC TTT GGC TTT GTG CTC 1404
Leu Gly Ile Phe Gly Phe Leu Ala Phe Gly Phe Val Leu
455 460

ATT ACC TTC AGC TGC CAC TTC TAC GAC TTC TTC AAC CAG 1443
Ile Thr Phe Ser Cys His Phe Tyr Asp Phe Phe Asn Gln
465 470 475

GCT GAG TGG GAG CGC AGC TTC CGG GAC TAT GTG CTA TGT 1482
Ala Glu Trp Glu Arg Ser Phe Arg Asp Tyr Val Leu Cys
480 485 490

CAG GCC AAT GTG ACC ATC GGG CTG CCC ACC AAG CAG CCC 1521
Gln Ala Asn Val Thr Ile Gly Leu Pro Thr Lys Gln Pro
495 500

ATC CCT GAC TGT GAG ATC AAG AAT CGC CCG AGC CTT CTG 1560
Ile Pro Asp Cys Glu Ile Lys Asn Arg Pro Ser Leu Leu
505 510 515

GTG GAG AAG ATC AAC CTG TTT GCC ATG TTT GGA ACT GGC 1599
Val Glu Lys Ile Asn Leu Phe Ala Met Phe Gly Thr Gly
520 525

ATC GCC ATG AGC ACC TGG GTC TGG ACC AAG GCC ACG CTG 1638
Ile Ala Met Ser Thr Trp Val Trp Thr Lys Ala Thr Leu
530 535 540

CTC ATC TGG AGG CGT ACC TGG TGC AGG TTG ACT GGG CAG 1677
Leu Ile Trp Arg Arg Thr Trp Cys Arg Leu Thr Gly Gln
545 550 555

AGT GAC GAT GAG CCA AAG CGG ATC AAG AAG AGC AAG ATG 1716
Ser Asp Asp Glu Pro Lys Arg Ile Lys Lys Ser Lys Met
560 565

ATT GCC AAG GCC TTC TCT AAG CGG CAC GAG CTC CTG CAG 1755
Ile Ala Lys Ala Phe Ser Lys Arg His Glu Leu Leu Gln
570 575 580

AAC CCA GGC CAG GAG CTG TCC TTC AGC ATG CAC ACT GTG 1794
Asn Pro Gly Gln Glu Leu Ser Phe Ser Met His Thr Val
585 590

TCC CAC GAC GGG CCC GTG GCG GGC TTG GCC TTT GAC CTC 1833
Ser His Asp Gly Pro Val Ala Gly Leu Ala Phe Asp Leu
595 600 605

AAT GAG CCC TCA GCT GAT GTC TCC TCT GCC TGG GCC CAG 1872
Asn Glu Pro Ser Ala Asp Val Ser Ser Ala Trp Ala Gln
610 615 620

CAT GTC ACC AAG ATG GTG GCT CGG AGA GGA GCC ATA CTG 1911
His Val Thr Lys Met Val Ala Arg Arg Gly Ala Ile Leu
625 630

CCC CAG GAT ATT TCT GTC ACC CCT GTG GCA ACT CCA GTG 1950
Pro Gln Asp Ile Ser Val Thr Pro Val Ala Thr Pro Val
635 640 645

CCC CCA GAG GAA CAA GCC AAC CTG TGG CTG GTT GAG GCA 1989
Pro Pro Glu Glu Gln Ala Asn Leu Trp Leu Val Glu Ala
650 655

GAG ATC TCC CCA GAG CTG CAG AAG CGC CTG GGC CGG AAG 2028
Glu Ile Ser Pro Glu Leu Gln Lys Arg Leu Gly Arg Lys
660 665 670

AAG AAG AGG AGG AAG AGG AAG AAG GAG GTG TGC CCG CTG 2067
Lys Lys Arg Arg Lys Arg Lys Lys Glu Val Cys Pro Leu
675 680 685

GCG CCG CCC CCT GAG CTT CAC CCC CCT GCC CCT GCC CCC 2106
Ala Pro Pro Pro Glu Leu His Pro Pro Ala Pro Ala Pro
690 695

AGT ACC ATT CCT CGA CTG CCT CAG CTG CCC CGG CAG AAA 2145
Ser Thr Ile Pro Arg Leu Pro Gln Leu Pro Arg Gln Lys
700 705 710

TGC CTG GTG GCT GCA GGT GCC TGG GGA GCT GGG GAC TCT 2184
Cys Leu Val Ala Ala Gly Ala Trp Gly Ala Gly Asp Ser
715 720

TGC CGA CAG GGA GCG TGG ACC CTG GTC TCC AAC CCA TTC 2223
Cys Arg Gln Gly Ala Trp Thr Leu Val Ser Asn Pro Phe
725 730 735



TGC CCA GAG CCC AGT CCC CCT CAG GAT CCA TTT CTG CCC 2262
Cys Pro Glu Pro Ser Pro Pro Gln Asp Pro Phe Leu Pro
740 745 750

AGT GCA CCG GCC CCC GTG GCA TGG GCT CAT GGC CGC CGA 2301
Ser Ala Pro Ala Pro Val Ala Trp Ala His Gly Arg Arg
755 760

CAG GGC CTG GGG CCT ATT CAC TCC CGC ACC AAC CTG ATG 2340
Gln Gly Leu Gly Pro Ile His Ser Arg Thr Asn Leu Met
765 770 775

GAC ACA GAA CTC ATG GAT GCA GAC TCG GAC TTC TGAGCCT 2380
Asp Thr Glu Leu Met Asp Ala Asp Ser Asp Phe
780 785 787

GCAGAGCAGG ACCTGGGACA GGAAAGAGAG GAACCAATAC CTTCAAGGCT 2430

CTTCTTCCTC ACCGAGCATG CTTCCCTAGG ATCCCGTCTT CCAGAGAACC 2480

TGTGGGCTGA CTGCCCTCCG AAGAGAGTTC TGGATGTCTG GCTCAAAGCA 2530

GCAGGACTGT GGGAAAGAGC CTAACATCTC CATGGGGAGG CCTCACCCCA 2580

GGGACAGGGC CCTGGAGCTC AGGGTCCTTG TTTCTGCCCT GCCAGCTGCA 2630

GCCTGGTTGG CAGCATCTGC TCCATCGGGG CAGGGGGTAT GCAGAGCTTG 2680

TGGTGGGGCA GGAACGGTGG AGGCAGAGGT GACAGTTCCC AGAGTGGGCT 2730

TTGGTGGCCA GGGAGGCAGC CTAGCCTATG TCTGGCAGAT GAGGGCTGGC 2780

TGCCGTTTTC TGGGCTGATG GGTGCCCTTT CCTGGCAGTC TCAGTCCAAA 2830

AGTGTTGACT GTGTCATTAG TCCTTTGTCT AAGTAGGGCC AGGGCACCGT 2880

ATTCCTCTCC CAGGTGTTTG TGGGGCTGGA AGGACCTGCT CCCACAGGGG 2930

CCATGTCCTC TCTTAATAGG TGGCACTACC CCAAACCCAC CG 2972

(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 787 amino acids
(B) TYPE: Amino Acid
(D) TOPOLOGY: Linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Met Ala Ala Ala Arg Pro Ala Arg Gly Pro Glu Leu Pro Leu Leu
1 5 10 15

Gly Leu Leu Leu Leu Leu Leu Leu Gly Asp Pro Gly Arg Gly Ala
20 25 30

Ala Ser Ser Gly Asn Ala Thr Gly Pro Gly Pro Arg Ser Ala Gly
35 40 45

Gly Ser Ala Arg Arg Ser Ala Ala Val Thr Gly Pro Pro Pro Pro
50 55 60

Leu Ser His Cys Gly Arg Ala Ala Pro Cys Glu Pro Leu Arg Tyr
65 70 75

Asn Val Cys Leu Gly Ser Val Leu Pro Tyr Gly Ala Thr Ser Thr
80 85 90

Leu Leu Ala Gly Asp Ser Asp Ser Gln Glu Glu Ala His Gly Lys
95 100 105

Leu Val Leu Trp Ser Gly Leu Arg Asn Ala Pro Arg Cys Trp Ala
110 115 120

Val Ile Gln Pro Leu Leu Cys Ala Val Tyr Met Pro Lys Cys Glu
125 130 135

Asn Asp Arg Val Glu Leu Pro Ser Arg Thr Leu Cys Gln Ala Thr
140 145 150

Arg Gly Pro Cys Ala Ile Val Glu Arg Glu Arg Gly Trp Pro Asp
155 160 165

Phe Leu Arg Cys Thr Pro Asp Arg Phe Pro Glu Gly Cys Thr Asn
170 175 180

Glu Val Gln Asn Ile Lys Phe Asn Ser Ser Gly Gln Cys Glu Val
185 190 195

Pro Leu Val Arg Thr Asp Asn Pro Lys Ser Trp Tyr Glu Asp Val
200 205 210

Glu Gly Cys Gly Ile Gln Cys Gln Asn Pro Leu Phe Thr Glu Ala
215 220 225

Glu His Gln Asp Met His Ser Tyr Ile Ala Ala Phe Gly Ala Val
230 235 240

Thr Gly Leu Cys Thr Leu Phe Thr Leu Ala Thr Phe Val Ala Asp
245 250 255

Trp Arg Asn Ser Asn Arg Tyr Pro Ala Val Ile Leu Phe Tyr Val
260 265 270

Asn Ala Cys Phe Phe Val Gly Ser Ile Gly Trp Leu Ala Gln Phe
275 280 285

Met Asp Gly Ala Arg Arg Glu Ile Val Cys Arg Ala Asp Gly Thr
290 295 300

Met Arg Leu Gly Glu Pro Thr Ser Asn Glu Thr Leu Ser Cys Val
305 310 315

Ile Ile Phe Val Ile Val Tyr Tyr Ala Leu Met Ala Gly Val Val
320 325 330

Trp Phe Val Val Leu Thr Tyr Ala Trp His Thr Ser Phe Lys Ala
335 340 345

Leu Gly Thr Thr Tyr Gln Pro Leu Ser Gly Lys Thr Ser Tyr Phe
350 355 360

His Leu Leu Thr Trp Ser Leu Pro Phe Val Leu Thr Val Ala Ile
365 370 375

Leu Ala Val Ala Gln Val Asp Gly Asp Ser Val Ser Gly Ile Cys
380 385 390

Phe Val Gly Tyr Lys Asn Tyr Arg Tyr Arg Ala Gly Phe Val Leu
395 400 405

Ala Pro Ile Gly Leu Val Leu Ile Val Gly Gly Tyr Phe Leu Ile
410 415 420

Arg Gly Val Met Thr Leu Phe Ser Ile Lys Ser Asn His Pro Gly
425 430 435

Leu Leu Ser Glu Lys Ala Ala Ser Lys Ile Asn Glu Thr Met Leu
440 445 450

Arg Leu Gly Ile Phe Gly Phe Leu Ala Phe Gly Phe Val Leu Ile
455 460 465

Thr Phe Ser Cys His Phe Tyr Asp Phe Phe Asn Gln Ala Glu Trp
470 475 480

Glu Arg Ser Phe Arg Asp Tyr Val Leu Cys Gln Ala Asn Val Thr
485 490 495

Ile Gly Leu Pro Thr Lys Gln Pro Ile Pro Asp Cys Glu Ile Lys
500 505 510

Asn Arg Pro Ser Leu Leu Val Glu Lys Ile Asn Leu Phe Ala Met
515 520 525

Phe Gly Thr Gly Ile Ala Met Ser Thr Trp Val Trp Thr Lys Ala
530 535 540

Thr Leu Leu Ile Trp Arg Arg Thr Trp Cys Arg Leu Thr Gly Gln
545 550 555

Ser Asp Asp Glu Pro Lys Arg Ile Lys Lys Ser Lys Met Ile Ala
560 565 570

Lys Ala Phe Ser Lys Arg His Glu Leu Leu Gln Asn Pro Gly Gln
575 580 585

Glu Leu Ser Phe Ser Met His Thr Val Ser His Asp Gly Pro Val
590 595 600

Ala Gly Leu Ala Phe Asp Leu Asn Glu Pro Ser Ala Asp Val Ser
605 610 615

Ser Ala Trp Ala Gln His Val Thr Lys Met Val Ala Arg Arg Gly
620 625 630

Ala Ile Leu Pro Gln Asp Ile Ser Val Thr Pro Val Ala Thr Pro
635 640 645

Val Pro Pro Glu Glu Gln Ala Asn Leu Trp Leu Val Glu Ala Glu
650 655 660

Ile Ser Pro Glu Leu Gln Lys Arg Leu Gly Arg Lys Lys Lys Arg
665 670 675

Arg Lys Arg Lys Lys Glu Val Cys Pro Leu Ala Pro Pro Pro Glu
680 685 690

Leu His Pro Pro Ala Pro Ala Pro Ser Thr Ile Pro Arg Leu Pro
695 700 705

Gln Leu Pro Arg Gln Lys Cys Leu Val Ala Ala Gly Ala Trp Gly
710 715 720

Ala Gly Asp Ser Cys Arg Gln Gly Ala Trp Thr Leu Val Ser Asn
725 730 735

Pro Phe Cys Pro Glu Pro Ser Pro Pro Gln Asp Pro Phe Leu Pro
740 745 750

Ser Ala Pro Ala Pro Val Ala Trp Ala His Gly Arg Arg Gln Gly
755 760 765

Leu Gly Pro Ile His Ser Arg Thr Asn Leu Met Asp Thr Glu Leu
770 775 780

Met Asp Ala Asp Ser Asp Phe
785 787
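
The codon-by-codon layout used in the nucleic acid listings can be checked mechanically. The following is a minimal Python sketch, assuming the standard genetic code and a reading frame that begins at the first base supplied; the names `CODON_TABLE`, `THREE_LETTER`, and `translate` are illustrative and are not part of the disclosure. It translates the first row of the coding region of SEQ ID NO:3 into three-letter residue codes.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter residue codes, as used in the sequence listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate(coding_dna: str) -> list[str]:
    """Translate a coding strand (spaces ignored) until the first stop codon."""
    seq = "".join(coding_dna.split()).upper()
    residues = []
    for i in range(0, len(seq) - 2, 3):
        amino = CODON_TABLE[seq[i:i + 3]]
        if amino == "*":  # stop codon: end of the open reading frame
            break
        residues.append(THREE_LETTER[amino])
    return residues


# First row of the coding region of SEQ ID NO:3 (bases 13 to 39).
first_row = "ATG GCC GCT GCC CGC CCA GCG CGG GGG"
print(" ".join(translate(first_row)))
```

Running the same function over a full row of a listing and comparing the result with the printed translation line is one way to spot a transcription error in either line.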
WHAT IS CLAIMED IS:

1. Isolated vertebrate Smoothened.

2. Isolated vertebrate Smoothened having at least about 80% sequence identity with native
sequence vertebrate Smoothened comprising amino acid residues 1 to 787 of SEQ ID NO:4.

3. The vertebrate Smoothened of claim 2 wherein said Smoothened has at least about 90%
sequence identity.

4. The vertebrate Smoothened of claim 3 wherein said Smoothened has at least about 95%
sequence identity.

5. Isolated native sequence vertebrate Smoothened comprising the amino acid sequence of SEQ
ID NO:4.

6. Isolated native sequence vertebrate Smoothened comprising the amino acid sequence of SEQ
ID NO:2.

7. A chimeric molecule comprising the vertebrate Smoothened of claim 1 fused to a
heterologous amino acid sequence.

8. The chimeric molecule of claim 7 wherein said heterologous amino acid sequence is an
epitope tag sequence.

9. An antibody which specifically binds to the vertebrate Smoothened of claim 1.

10. The antibody of claim 9 wherein said antibody is a monoclonal antibody.

11. The antibody of claim 9 which is a neutralizing antibody.

12. The antibody of claim 9 which is an agonist antibody.

13. Isolated nucleic acid encoding vertebrate Smoothened.

14. The nucleic acid of claim 13 wherein said nucleic acid encodes native sequence vertebrate
Smoothened comprising the amino acid sequence of SEQ ID NO:4.

15. The nucleic acid of claim 13 wherein said nucleic acid encodes native sequence vertebrate
Smoothened comprising the amino acid sequence of SEQ ID NO:2.

16. A vector comprising the nucleic acid of claim 13.

17. The vector of claim 16 operably linked to control sequences recognized by a host cell
transformed with the vector.

18. A host cell comprising the vector of claim 16.

19. A process of using a nucleic acid molecule encoding vertebrate Smoothened to effect
production of vertebrate Smoothened comprising culturing the host cell of claim 18.

20. The process of claim 19 further comprising recovering the vertebrate Smoothened from the
host cell culture.

21. An article of manufacture, comprising a container and a composition contained within said
container, wherein the composition includes vertebrate Smoothened or vertebrate Smoothened antibodies.

22. The article of manufacture of claim 21 further comprising instructions for using the vertebrate
Smoothened or vertebrate Smoothened antibodies in vivo or ex vivo.

23. A non-human, transgenic animal which contains cells that express nucleic acid encoding
vertebrate Smoothened.

24. The animal of claim 23 which is a mouse or rat.

25. A non-human, knockout animal which contains cells having an altered gene encoding
vertebrate Smoothened.

26. The animal of claim 25 which is a mouse or rat.

27. A protein complex comprising vertebrate Smoothened protein and vertebrate Patched protein.

28. The protein complex of claim 27 further comprising vertebrate Hedgehog protein.

29. The protein complex of claim 28 wherein the vertebrate Hedgehog protein binds to the
vertebrate Patched protein but does not bind to the vertebrate Smoothened protein.

30. The protein complex of claim 27 which is a receptor for vertebrate Hedgehog protein.

31. A vertebrate Patched which binds to vertebrate Smoothened.

32. The vertebrate Patched of claim 31 which has less than 100% sequence identity with a native
sequence vertebrate Patched.
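
Several of the claims above recite percent sequence identity thresholds. As an illustration only of how such a figure may be computed, the following Python sketch counts identical positions between two sequences that have already been aligned to equal length, with gaps written as `-`. The function name `percent_identity`, the gap convention, and the choice of denominator are assumptions made for this example; the definition of sequence identity given in the specification governs the claims. The example compares residues 1 to 15 of SEQ ID NO:2 and SEQ ID NO:4, written in one-letter code without gaps.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of compared alignment columns holding the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # a column that is a gap in both sequences is skipped
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * identical / compared


# Residues 1-15 of SEQ ID NO:2 (rat) and SEQ ID NO:4 (human), one-letter code.
rat_1_15 = "MAAGRPVRGPELAPR"
human_1_15 = "MAAARPARGPELPLL"
print(f"{percent_identity(rat_1_15, human_1_15):.1f}% identity over residues 1-15")
```

A figure computed over a short ungapped window such as this one says little about the full-length proteins; a meaningful comparison would first align the complete sequences with an alignment program.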
GCGGCGCGCT CGCGCGGAGG TGGCTGCTGG GCCGCGGGCT GGCGTGGGGG 50

CGGAGCCGGG GAGCGACTCC CGCACCCCAC GGCCGGTGCC TGCCCTCCAT 100 

CGAGGGGCTG GGAGTTAGTT TTAATGGTGG GAGAGGGAAT GGGGCTGAAG 150 

ATCGGGGCCC CAGAGGGTTC CCAGGGTTGA AGACAATTCC AATCGAGGCG 200 

AGGGAGTCCG GGGTCCGTGC ATCCTGGCCC GGGCCTGCGC AGCTCAACAT 250 

GGGGCCCGGG TTCCAAAGTT TGCAAAGTTG GGAGCCGAGG GGCCCGGACG 300 

CGCGCGGCGC CTGGCGAAAG CTGGCCCCAG ACTTTCGGGG CGCACCGGTC 350 

GCCTAAGTAG CCTCCGCGGC CCCCGGGGTC GTGTGTGTGG CCAGGGGACT 400 

CCGGGGAGCT CGGGGGCGCC TCAGCTTCTG CTGAGTTGGC GGTTTGGCC 449 

ATG GCT GCT GGC CGC CCC GTG CGT GGG CCC GAG CTG GCG 488
Met Ala Ala Gly Arg Pro Val Arg Gly Pro Glu Leu Ala
1 5 10

CCC CGG AGG CTG CTG CAG TTG CTG CTG CTG GTA CTG CTT 527
Pro Arg Arg Leu Leu Gln Leu Leu Leu Leu Val Leu Leu
15 20 25

GGG GGC CGG GGC CGG GGG GCG GCC TTG AGC GGG AAC GTG 566
Gly Gly Arg Gly Arg Gly Ala Ala Leu Ser Gly Asn Val
30 35

ACC GGG CCT GGG CCT CGC AGT GCC GGC GGG AGC GCG AGG 605
Thr Gly Pro Gly Pro Arg Ser Ala Gly Gly Ser Ala Arg
40 45 50

AGG AAC GCG CCG GTG ACC AGC CCT CCG CCG CCG CTG CTG 644
Arg Asn Ala Pro Val Thr Ser Pro Pro Pro Pro Leu Leu
55 60 65

AGC CAC TGC GGC CGG GCC GCC CAC TGC GAG CCT TTG CGC 683
Ser His Cys Gly Arg Ala Ala His Cys Glu Pro Leu Arg
70 75

TAC AAC GTG TGC CTG GGC TCC GCG CTG CCC TAC GGA GCC 722
Tyr Asn Val Cys Leu Gly Ser Ala Leu Pro Tyr Gly Ala
80 85 90

ACC ACC ACG CTG CTG GCT GGG GAC TCG GAC TCG CAG GAG 761
Thr Thr Thr Leu Leu Ala Gly Asp Ser Asp Ser Gln Glu
95 100

GAA GCG CAC AGC AAG CTC GTG CTC TGG TCC GGC CTC CGG 800
Glu Ala His Ser Lys Leu Val Leu Trp Ser Gly Leu Arg
105 110 115

AAT GCT CCC CGA TGC TGG GCA GTG ATC CAG CCC CTG CTG 839
Asn Ala Pro Arg Cys Trp Ala Val Ile Gln Pro Leu Leu
120 125 130

TGT GCT GTC TAC ATG CCC AAG TGT GAA AAT GAC CGA GTG 878
Cys Ala Val Tyr Met Pro Lys Cys Glu Asn Asp Arg Val
135 140

GAG TTG CCC AGC CGT ACC CTC TGC CAG GCC ACC CGA GGC 917
Glu Leu Pro Ser Arg Thr Leu Cys Gln Ala Thr Arg Gly
145 150 155

CCC TGT GCC ATT GTG GAG CGG GAA CGA GGG TGG CCT GAC 956
Pro Cys Ala Ile Val Glu Arg Glu Arg Gly Trp Pro Asp
160 165

TTT CTG CGT TGC ACG CCG GAC CAC TTC CCT GAA GGC TGT 995
Phe Leu Arg Cys Thr Pro Asp His Phe Pro Glu Gly Cys
170 175 180

CCA AAC GAG GTA CAA AAC ATC AAG TTC AAC AGT TCA GGC 1034
Pro Asn Glu Val Gln Asn Ile Lys Phe Asn Ser Ser Gly
185 190 195

CAA TGT GAA GCA CCC TTG GTG AGG ACA GAC AAC CCC AAG 1073
Gln Cys Glu Ala Pro Leu Val Arg Thr Asp Asn Pro Lys
200 205

AGC TGG TAC GAG GAC GTG GAG GGC TGT GGG ATC CAG TGC 1112
Ser Trp Tyr Glu Asp Val Glu Gly Cys Gly Ile Gln Cys
210 215 220

CAG AAC CCG CTG TTC ACC GAG GCT GAG CAC CAG GAC ATG 1151
Gln Asn Pro Leu Phe Thr Glu Ala Glu His Gln Asp Met
225 230

CAC AGT TAC ATC GCA GCC TTC GGG GCG GTC ACC GGC CTC 1190
His Ser Tyr Ile Ala Ala Phe Gly Ala Val Thr Gly Leu
235 240 245

TGT ACA CTC TTC ACC CTG GCC ACC TTT GTG GCT GAC TGG 1229
Cys Thr Leu Phe Thr Leu Ala Thr Phe Val Ala Asp Trp
250 255 260

CGG AAC TCC AAT CGC TAC CCT GCG GTT ATT CTC TTC TAT 1268
Arg Asn Ser Asn Arg Tyr Pro Ala Val Ile Leu Phe Tyr
265 270

GTC AAT GCG TGT TTC TTT GTG GGC AGC ATT GGC TGG CTG 1307
Val Asn Ala Cys Phe Phe Val Gly Ser Ile Gly Trp Leu
275 280 285

TAC TAT GCC TTG ATG GCT GGA GTA GTG TGG TTC GTG GTC 1463
Tyr Tyr Ala Leu Met Ala Gly Val Val Trp Phe Val Val
330 335

CTC ACC TAT GCC TGG CAC ACC TCC TTC AAA GCC CTG GGC 1502
Leu Thr Tyr Ala Trp His Thr Ser Phe Lys Ala Leu Gly
340 345 350

ACC ACT TAC CAG CCT CTC TCG GGC AAG ACA TCC TAT TTC 1541
Thr Thr Tyr Gln Pro Leu Ser Gly Lys Thr Ser Tyr Phe
355 360

CAC CTG CTC ACG TGG TCA CTC CCC TTC GTC CTC ACT GTG 1580
His Leu Leu Thr Trp Ser Leu Pro Phe Val Leu Thr Val
365 370 375

GCA ATC CTT GCT GTG GCT CAG GTA GAT GGG GAC TCC GTG 1619
Ala Ile Leu Ala Val Ala Gln Val Asp Gly Asp Ser Val
380 385 390

AGT GGC ATC TGC TTT GTA GGC TAC AAG AAC TAT CGG TAC 1658
Ser Gly Ile Cys Phe Val Gly Tyr Lys Asn Tyr Arg Tyr
395 400

CGT GCT GGC TTT GTA CTT GCC CCA ATT GGC CTG GTG CTT 1697
Arg Ala Gly Phe Val Leu Ala Pro Ile Gly Leu Val Leu
405 410 415

ATT GTG GGA GGC TAC TTC CTC ATC CGA GGG GTC ATG ACT 1736
Ile Val Gly Gly Tyr Phe Leu Ile Arg Gly Val Met Thr
420 425

CTG TTC TCC ATC AAG AGC AAC CAC CCT GGG CTT CTG AGT 1775
Leu Phe Ser Ile Lys Ser Asn His Pro Gly Leu Leu Ser
430 435 440

GAG AAG GCA GCC AGC AAG ATC AAT GAG ACC ATG CTG CGC 1814
Glu Lys Ala Ala Ser Lys Ile Asn Glu Thr Met Leu Arg
445 450 455

CTG GGC ATT TTT GGC TTC CTC GCC TTT GGC TTC GTG CTC 1853
Leu Gly Ile Phe Gly Phe Leu Ala Phe Gly Phe Val Leu
460 465

ATC ACC TTC AGC TGC CAC TTC TAT GAC TTC TTC AAC CAG 1892
Ile Thr Phe Ser Cys His Phe Tyr Asp Phe Phe Asn Gln
470 475 480

GCT GAG TGG GAG CGT AGC TTC CGG GAC TAT GTG CTA TGC 1931
Ala Glu Trp Glu Arg Ser Phe Arg Asp Tyr Val Leu Cys
485 490

CAA GCC AAT GTG ACC ATT GGG CTG CCT ACC AAG AAG CCC 1970
Gln Ala Asn Val Thr Ile Gly Leu Pro Thr Lys Lys Pro
495 500 505

ATT CCT GAT TGT GAG ATC AAG AAT CGG CCC AGC CTC CTG 2009
Ile Pro Asp Cys Glu Ile Lys Asn Arg Pro Ser Leu Leu
510 515 520

GTG GAG AAG ATC AAT CTG TTT GCC ATG TTT GGC ACT GGC 2048
Val Glu Lys Ile Asn Leu Phe Ala Met Phe Gly Thr Gly
525 530

ATT GCC ATG AGC ACC TGG GTC TGG ACC AAG GCC ACC CTG 2087
Ile Ala Met Ser Thr Trp Val Trp Thr Lys Ala Thr Leu
535 540 545

CTC ATC TGG AGG CGC ACC TGG TGC AGG TTG ACT GGG CAC 2126
Leu Ile Trp Arg Arg Thr Trp Cys Arg Leu Thr Gly His
550 555

AGT GAT GAT GAA CCC AAG AGA ATC AAG AAA AGC AAG ATG 2165
Ser Asp Asp Glu Pro Lys Arg Ile Lys Lys Ser Lys Met
560 565 570

ATT GCC AAG GCC TTC TCT AAG CGG CGT GAA CTG CTG CAG 2204
Ile Ala Lys Ala Phe Ser Lys Arg Arg Glu Leu Leu Gln
575 580 585

AAC CCG GGC CAG GAG CTC TCC TTC AGC ATG CAC ACT GTC 2243
Asn Pro Gly Gln Glu Leu Ser Phe Ser Met His Thr Val
590 595

TCC CAT GAT GGA CCT GTT GCC GGT TTG GCT TTT GAA CTC 2282
Ser His Asp Gly Pro Val Ala Gly Leu Ala Phe Glu Leu
600 605 610

AAT GAA CCC TCA GCT GAT GTC TCC TCT GCC TGG GCC CAG 2321
Asn Glu Pro Ser Ala Asp Val Ser Ser Ala Trp Ala Gln
615 620

CAC GTC ACC AAG ATG GTG GCT CGA AGA GGA GCC ATA TTA 2360
His Val Thr Lys Met Val Ala Arg Arg Gly Ala Ile Leu
625 630 635

CCC CAG GAT GTG TCT GTC ACC CCT GTG GCA ACT CCA GTG 2399
Pro Gln Asp Val Ser Val Thr Pro Val Ala Thr Pro Val
640 645 650

CCA CCA GAA GAA CAA GCC AAC CTG TGG CTG GTT GAG GCA 2438
Pro Pro Glu Glu Gln Ala Asn Leu Trp Leu Val Glu Ala
655 660

GAG ATC TCC CCA GAG TTA GAG AAG CGT TTA GGC CGG AAG 2477
Glu Ile Ser Pro Glu Leu Glu Lys Arg Leu Gly Arg Lys
665 670 675

AAG AAG CGG AGG AAG AGG AAG AAG GAG GTG TGC CCC TTG 2516 
Lys Lys Arg Arg Lys Arg Lys Lys Glu Val Cys Pro Leu 
680 685 

GGG CCA GCC CCT GAA CTT CAC CAC TCT GCC CCT GTT CCT 2555 
Gly Pro Ala Pro Glu Leu His His Ser Ala Pro Val Pro 
690 695 700 

GCC ACC AGT GCA GTT CCT CGG CTG CCT CAG CTG CCT CGG 2594
Ala Thr Ser Ala Val Pro Arg Leu Pro Gln Leu Pro Arg
705 710 715

CAG AAG TGC CTA GTA GCT GCA AAT GCC TGG GGA ACA GGA 2633
Gln Lys Cys Leu Val Ala Ala Asn Ala Trp Gly Thr Gly
720 725

GAG CCC TGC CGA CAG GGA GCC TGG ACT GTA GTC TCC AAC 2672
Glu Pro Cys Arg Gln Gly Ala Trp Thr Val Val Ser Asn
730 735 740

CCC TTC TGC CCA GAG CCT AGT CCC CAT CAA GAT CCA TTT 2711
Pro Phe Cys Pro Glu Pro Ser Pro His Gln Asp Pro Phe
745 750

CTC CCT GGT GCC TCA GCC CCC AGG GTC TGG GCT CAG GGC 2750
Leu Pro Gly Ala Ser Ala Pro Arg Val Trp Ala Gln Gly
755 760 765

CGC CTC CAG GGG CTG GGA TCC ATT CAT TCC CGC ACT AAC 2789
Arg Leu Gln Gly Leu Gly Ser Ile His Ser Arg Thr Asn
770 775 780

CTA ATG GAG GCT GAG CTC TTG GAT GCA GAC TCG GAC TTC TG 2830
Leu Met Glu Ala Glu Leu Leu Asp Ala Asp Ser Asp Phe
785 790 793

AGCTTGCAGG GCAGGTCCTA GGATGGGGAA GACAAGTGCA CGCCTTCCTA 2880

TAGCTCTTCC TGAGAGCACA CCTCTGGGGT CTCATCTGAC AGTCTATGGG 2930

CCATGTATCT GCCTACAAGA GCTGTGTACG ACTGGCTAGA AGCAGCCAGA 2980

CCATAGAAAC AAGCTGAACA CAGCCACTGA TAGACCTCAC TTCAGAAGCA 3030

AGACCTGCAG TTCAGGACCC TTGCCTCTGC CCCCCAATTA GAGTCTGGCT 3080

GGCAGTGTTA GTCTCCAACA GAGCTTGTAC TAGGGTAGGA ACGGCAGAGG 3130

CAGGGGTGAT GGTACCCAGA GTGGGCTGGG GTGTCCAGTG AGGTAACCAA 3180

GCCCATGTCT GGCAGATGAG GGCTGGCTGC CCTTTTCTGT GCCAATGAGT 3230
GCCCTTTTCT GGCGCTCTGA GACCAAAAGT GTTTATTGTG TCATTTGTCC 3280
TTTTTCTAGG TGGGAACAGG ACTCTCTTTT TCCTCTTCCT GGTAGTTGTA 3330
ATGACTACTC CCATAAGGCC TAGAACTGCT CTCAGTAGGT GGCCCTGTCC 3380
AAAACACATC TTCACATCTT AGTTCCACTA GGCCAAACTC TTATTGGTTA 3430
GCACCTTAAA ACACACACAC ACACACACAC ACACACACAC ACACACACAC 3480
ACACACACAC ACCCTCTTAC TTCTGAGCTT GGTCTCAAGA GAGAGACAAC 3530
TGGTTCAGCT CCAGGCCTCT GAGAGTCATG TTTTCTTCCT CACATCCATC 3580
CAGTGGGGAT GGATCCTCTG ACTTAAGGGG CTACCTTGGG AAGCCTCTGT 3630
AGCTTCAGCC AGGCAAGAAA GCTTCTTCCA ACTTCTGTAT CTGGTGGGAA 3680
GGAGGACTCC CTACTTTTTA CAATGTCTAG TCATTTTCAT AGTGCCCCAC 3730
ATTCAAGAAC CAGACAGCAG GATGCCTTAG AAGCTGGCTG GGTTCCAGGT 3780
CAGAGGCTCA GTATGAGAAG AAGAAATATG AACAGTAAAT AAAACATTTT 3830
TGTATAAAAA AAAAAAAAAA AAAA 3854
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1 CGGGGGTTGG CCATGGCCGC TGCCCGCCCA GCGCGGGGGC CGGAGCTCCC 
SSSSIcC GGTACCGGCG ACGGGCGGGT CGCGCCCCCG GCCTCGAGGG 
1 MetAlaAl aAlaArgPro AlaArgGlyP roGluLeuPr 

Met 

51 GCTCCTGGGG CTGCTGCTGC TGCTGCTGCT GGGGGACCCG GGCCGGGGGG 
S^GACGACG ACGACGACGA CCCCCTGGGC CCGGCCCCCC 
14 oLeuLeuGly LauLeuLeuL euLeuLeuLe uGlyAspPro GlyArgGlyAla 

101 CGGCCTCGAG CGGGAACGCG ACC6GGCCTG GGCCTCGGA6 CGCGGGCGGG 
^-SSSi? r-??rTTGCGC TGGCCCGGAC CCGGAGCCTC GCGCCCGCCC 
31 AlaSerSe rGlyAsnAla ThrGlyProG lyProArgSe rAlaGlyOly 

151 AGCGCGAGGA GGAGCGCGGC GGTGACTGGC CCTCCGCCGC CGCTGAGCCA 
TCGCGCTCCT CCTCGCGCCG CCACTGACCG GGAGGCGGCG GCGACTCGGT 
47 SerAlaArgA rgSerAlaAl aValThrGly ProProProP roLeuSarHxs 

9 01 CTGCGGCCGG GCTGCCCCCT GCGAGCCGCT GCGCTACAAC GTGTGCCTGG 

?Sacggggga CGCTCGGCGA cgcgatgttg cacacggacc 
64 CyaGlyArg AlaAlaProC ysGluProLe uArgTyrAsn ValCysLeuG 

251 GCTCGGTGCT GCCCTACGGG GCCACCTCCA CACTGCTGGC CGGAGACTCG 
SIg?Scga ?gggatgccc CGGTGGAGGT GTGACGACCG GCCTCTGAGC 
81 lySerValLe uProTyrGly AlaThrSerT hrLeuLeuAl aGlyAspSer 

301 GACTCCCAGG AGGAAGCGCA CGGCAAGCTC GTGCTCTGGT CGGGCCTCCG 
CTGAGGGTCC TCCTTCGCGT GCCGTTCGAG CACGAGACCA GCCCGGAGGC 
97 AspSerGlnG luGluAlaHi sGlyLysLeu ValLeuTrpS erGlyLeuAr 

351 GAATGCCCCC CGCTGCTGGG CAGTGATCCA GCCCCTGCTG TGTGCCGTAT 
CTTACGGGGG GCGACGACCC GTCACTAGGT CGGGGACGAC ACACGGCATA 
114 gAsnAlaPro ArgCysTrpA laValIleGl nProLeuLeu CysAlaValTyr 

401 ACATGCCCAA GTGTGAGAAT GACCGGGTGG AGCTGCCCAG CCGTACCCTC 
TGTACGGGTT CACACTCTTA CTGGCCCACC TCGACGGGTC GGCATGGGAG 
131 MetProLy sCysGluAsn AspArgValG luLeuProSe rArgThrLeu 

451 TGCCAGGCCA CCCGAGGCCC CTGTGCCATC GTGGAGAGGG AGCGGGGCTG 
ACGGTCCGGT GGGCTCCGGG GACACGGTAG CACCTCTCCC TCGCCCCGAC 
147 CysGlnAlaT hrArgGlyPr oCysAlaIle ValGluArgG luArgGlyTrp 

501 GCCTGACTTC CTGCGCTGCA CTCCTGACCG CTTCCCTGAA GGCTGCACGA 
CGGACTGAAG GACGCGACGT GAGGACTGGC GAAGGGACTT CCGACGTGCT 
164 ProAspPhe LeuArgCysT hrProAspAr gPheProGlu GlyCysThrA 

551 ATGAGGTGCA GAACATCAAG TTCAACAGTT CAGGCCAGTG CGAAGTGCCC 
TACTCCACGT CTTGTAGTTC AAGTTGTCAA GTCCGGTCAC GCTTCACGGG 
181 snGluValGl nAsnIleLys PheAsnSerS erGlyGlnCy sGluValPro 

601 TTGGTTCGGA CAGACAACCC CAAGAGCTGG TACGAGGACG TGGAGGGCTG 
AACCAAGCCT GTCTGTTGGG GTTCTCGACC ATGCTCCTGC ACCTCCCGAC 
197 LeuValArgT hrAspAsnPr oLysSerTrp TyrGluAspV alGluGlyCy 
651 CGGCATCCAG TGCCAGAACC CGCTCTTCAC AGAGGCTGAG CACCAGGACA 
GCCGTAGGTC ACGGTCTTGG GCGAGAAGTG TCTCCGACTC GTGGTCCTGT 
214 sGlyIleGln CysGlnAsnP roLeuPheTh rGluAlaGlu HisGlnAspMet 

701 TGCACAGCTA CATCGCGGCC TTCGGGGCCG TCACGGGCCT CTGCACGCTC 
ACGTGTCGAT GTAGCGCCGG AAGCCCCGGC AGTGCCCGGA GACGTGCGAG 
231 HisSerTy rIleAlaAla PheGlyAlaV alThrGlyLe uCysThrLeu 

751 TTCACCCTGG CCACATTCGT GGCTGACTGG CGGAACTCGA ATCGCTACCC 
AAGTGGGACC GGTGTAAGCA CCGACTGACC GCCTTGAGCT TAGCGATGGG 
247 PheThrLeuA laThrPheVa lAlaAspTrp ArgAsnSerA snArgTyrPro 

801 TGCTGTTATT CTCTTCTACG TCAATGCGTG CTTCTTTGTG GGCAGCATTG 
ACGACAATAA GAGAAGATGC AGTTACGCAC GAAGAAACAC CCGTCGTAAC 
264 AlaValIle LeuPheTyrV alAsnAlaCy sPhePheVal GlySerIleG 

start clone 14 

851 GCTGGCTGGC CCAGTTCATG GATGGTGCCC GCCGAGAGAT CGTCTGCCGT 
CGACCGACCG GGTCAAGTAC CTACCACGGG CGGCTCTCTA GCAGACGGCA 
281 lyTrpLeuAl aGlnPheMet AspGlyAlaA rgArgGluIl eValCysArg 

901 GCAGATGGCA CCATGAGGCT TGGGGAGCCC ACCTCCAATG AGACTCTGTC 
CGTCTACCGT GGTACTCCGA ACCCCTCGGG TGGAGGTTAC TCTGAGACAG 
297 AlaAspGlyT hrMetArgLe uGlyGluPro ThrSerAsnG luThrLeuSe 

951 CTGCGTCATC ATCTTTGTCA TCGTGTACTA CGCCCTGATG GCTGGTGTGG 
GACGCAGTAG TAGAAACAGT AGCACATGAT GCGGGACTAC CGACCACACC 
314 rCysValIle IlePheValI leValTyrTy rAlaLeuMet AlaGlyValVal 

1001 TTTGGTTTGT GGTCCTCACC TATGCCTGGC ACACTTCCTT CAAAGCCCTG 
AAACCAAACA CCAGGAGTGG ATACGGACCG TGTGAAGGAA GTTTCGGGAC 
331 TrpPheVa lValLeuThr TyrAlaTrpH isThrSerPh eLysAlaLeu 

1051 GGCACCACCT ACCAGCCTCT CTCGGGCAAG ACCTCCTACT TCCACCTGCT 
CCGTGGTGGA TGGTCGGAGA GAGCCCGTTC TGGAGGATGA AGGTGGACGA 
347 GlyThrThrT yrGlnProLe uSerGlyLys ThrSerTyrP heHisLeuLeu 

1101 CACCTGGTCA CTCCCCTTTG TCCTCACTGT GGCAATCCTT GCTGTGGCGC 
GTGGACCAGT GAGGGGAAAC AGGAGTGACA CCGTTAGGAA CGACACCGCG 
364 ThrTrpSer LeuProPheV alLeuThrVa lAlaIleLeu AlaValAlaG 

1151 AGGTGGATGG GGACTCTGTG AGTGGCATTT GTTTTGTGGG CTACAAGAAC 
TCCACCTACC CCTGAGACAC TCACCGTAAA CAAAACACCC GATGTTCTTG 
381 lnValAspGl yAspSerVal SerGlyIleC ysPheValGl yTyrLysAsn 

1201 TACCGATACC GTGCGGGCTT CGTGCTGGCC CCAATCGGCC TGGTGCTCAT 
ATGGCTATGG CACGCCCGAA GCACGACCGG GGTTAGCCGG ACCACGAGTA 
397 TyrArgTyrA rgAlaGlyPh eValLeuAla ProIleGlyL euValLeuIl 

1251 CGTGGGAGGC TACTTCCTCA TCCGAGGAGT CATGACTCTG TTCTCCATCA 
GCACCCTCCG ATGAAGGAGT AGGCTCCTCA GTACTGAGAC AAGAGGTAGT 
414 eValGlyGly TyrPheLeuI leArgGlyVa lMetThrLeu PheSerIleLys 
[Nucleotides 1301 to 1950 and the corresponding amino acid residues (approximately 431 to 646) are illegible in the source. An "end clone 5" annotation appears within this region, apparently between the rows beginning at nucleotides 1851 and 1901.]

1951 CCCCCAGAGG AACAAGCCAA CCTGTGGCTG GTTGAGGCAG AGATCTCCCC 
GGGGGTCTCC TTGTTCGGTT GGACACCGAC CAACTCCGTC TCTAGAGGGG 
647 ProProGluG luGlnAlaAs nLeuTrpLeu ValGluAlaG luIleSerPro 

2001 AGAGCTGCAG AAGCGCCTGG GCCGGAAGAA GAAGAGGAGG AAGAGGAAGA 
TCTCGACGTC TTCGCGGACC CGGCCTTCTT CTTCTCCTCC TTCTCCTTCT 
664 GluLeuGln LysArgLeuG lyArgLysLy sLysArgArg LysArgLysL 

2051 AGGAGGTGTG CCCGCTGGCG CCGCCCCCTG AGCTTCACCC CCCTGCCCCT 
TCCTCCACAC GGGCGACCGC GGCGGGGGAC TCGAAGTGGG GGGACGGGGA 
681 ysGluValCy sProLeuAla ProProProG luLeuHisPr oProAlaPro 

2101 GCCCCCAGTA CCATTCCTCG ACTGCCTCAG CTGCCCCGGC AGAAATGCCT 
CGGGGGTCAT GGTAAGGAGC TGACGGAGTC GACGGGGCCG TCTTTACGGA 
697 AlaProSerT hrIleProAr gLeuProGln LeuProArgG lnLysCysLe 

2151 GGTGGCTGCA GGTGCCTGGG GAGCTGGGGA CTCTTGCCGA CAGGGAGCGT 
CCACCGACGT CCACGGACCC CTCGACCCCT GAGAACGGCT GTCCCTCGCA 
714 uValAlaAla GlyAlaTrpG lyAlaGlyAs pSerCysArg GlnGlyAlaTrp 

2201 GGACCCTGGT CTCCAACCCA TTCTGCCCAG AGCCCAGTCC CCCTCAGGAT 
CCTGGGACCA GAGGTTGGGT AAGACGGGTC TCGGGTCAGG GGGAGTCCTA 
731 ThrLeuVa lSerAsnPro PheCysProG luProSerPr oProGlnAsp 

2251 CCATTTCTGC CCAGTGCACC GGCCCCCGTG GCATGGGCTC ATGGCCGCCG 
GGTAAAGACG GGTCACGTGG CCGGGGGCAC CGTACCCGAG TACCGGCGGC 
747 ProPheLeuP roSerAlaPr oAlaProVal AlaTrpAlaH isGlyArgArg 

2301 ACAGGGCCTG GGGCCTATTC ACTCCCGCAC CAACCTGATG GACACAGAAC 
TGTCCCGGAC CCCGGATAAG TGAGGGCGTG GTTGGACTAC CTGTGTCTTG 
764 GlnGlyLeu GlyProIleH isSerArgTh rAsnLeuMet AspThrGluL 

2351 TCATGGATGC AGACTCGGAC TTCTGAGCCT GCAGAGCAGG ACCTGGGACA 
AGTACCTACG TCTGAGCCTG AAGACTCGGA CGTCTCGTCC TGGACCCTGT 
781 euMetAspAl aAspSerAsp Phe 

Stop 

2401 GGAAAGAGAG GAACCAATAC CTTCAAGGCT CTTCTTCCTC ACCGAGCATG 
CCTTTCTCTC CTTGGTTATG GAAGTTCCGA GAAGAAGGAG TGGCTCGTAC 

2451 CTTCCCTAGG ATCCCGTCTT CCAGAGAACC TGTGGGCTGA CTGCCCTCCG 
GAAGGGATCC TAGGGCAGAA GGTCTCTTGG ACACCCGACT GACGGGAGGC 

2501 AAGAGAGTTC TGGATGTCTG GCTCAAAGCA GCAGGACTGT GGGAAAGAGC 
TTCTCTCAAG ACCTACAGAC CGAGTTTCGT CGTCCTGACA CCCTTTCTCG 

2551 CTAACATCTC CATGGGGAGG CCTCACCCCA GGGACAGGGC CCTGGAGCTC 
GATTGTAGAG GTACCCCTCC GGAGTGGGGT CCCTGTCCCG GGACCTCGAG 

2601 AGGGTCCTTG TTTCTGCCCT GCCAGCTGCA GCCTGGTTGG CAGCATCTGC 
TCCCAGGAAC AAAGACGGGA CGGTCGACGT CGGACCAACC GTCGTAGACG 

FIG. 4D 
2951 TGGCACTACC CCAAACCCAC CG 
ACCGTGATGG GGTTTGGGTG GC 



FIG. 4E 